GEOMETRIC SEPARATION BY 
SINGLE-PASS ALTERNATING THRESHOLDING 



GITTA KUTYNIOK 

Abstract. Modern data is customarily of multimodal nature, and analysis tasks typically 
require separation into the single components. Although this is a highly ill-posed problem, the 
morphological difference of these components sometimes allows a very precise separation, such 
as, for instance, in neurobiological imaging a separation into spines (pointlike structures) and 
dendrites (curvilinear structures). Recently, applied harmonic analysis introduced powerful 
methodologies to achieve this task, exploiting specifically designed representation systems 
in which the components are sparsely representable, combined with either performing $\ell_1$ 
minimization or thresholding on the combined dictionary. 

In this paper we provide a thorough theoretical study of the separation of a distributional 
model situation of point- and curvilinear singularities exploiting a surprisingly simple single- 
pass alternating thresholding method applied to the two complementary frames: wavelets 
and curvelets. Utilizing the fact that the coefficients are clustered geometrically, thereby 
exhibiting clustered/geometric sparsity in the chosen frames, we prove that at sufficiently 
fine scales arbitrarily precise separation is possible. Even more surprising, it turns out that 
the thresholding index sets converge to the wavefront sets of the point- and curvilinear 
singularities in phase space and that those wavefront sets are perfectly separated by the 
thresholding procedure. Main ingredients of our analysis are the novel notion of cluster 
coherence and clustered/geometric sparsity as well as a microlocal analysis viewpoint. 



1. Introduction 

Along with the deluge of data we face today, it is not surprising that the complexity of 
such data is also increasing. One instance of this phenomenon is the occurrence of multi- 
ple components, and hence, analyzing such data typically involves a separation step. One 
most intriguing example comes from neurobiological imaging, where images of neurons from 
brains affected by Alzheimer's disease are studied with the hope of detecting specific artifacts 
of this disease. The prominent parts of images of neurons are spines (pointlike structures) and 
dendrites (curvelike structures), which require separate analyses, for instance, counting the 
number of spines of a particular shape and determining the thickness of dendrites [31, 34]. 

From an educated viewpoint, it seems almost impossible to extract two images out of one 
image; the only possible point of attack is the morphological difference of the components. 
The new paradigm of sparsity, which has lately led to some spectacular successes in solving 
such underdetermined systems, does provide a powerful means to explore this difference. The 
main sparsity-based approach towards solving such separation problems consists in carefully 
selecting two representation systems, each one providing a sparse representation of one of 
the components and both being incoherent with respect to the other - the encoding of the 
morphological difference -, followed by a procedure which generates a sparse expansion in the 
dictionary combining the two representation systems. This intuitively automatically forces 
the different components into the coefficients of the 'correct' representation system. 

Browsing through the literature, the two main sparsity-based separation procedures can be 
identified to be $\ell_1$ minimization (see, e.g., [2, 15, 16, 17, 19, 20, 21, 22, 23, 27, 36, 37, 38, 40]) 
and thresholding (see, e.g., [1, 21, 32, 33]). For general papers on $\ell_1$ minimization techniques 
we refer to [7, 9, 14, 13, 12] and for thresholding to [39] or the reference list in the beautiful 
survey paper [3]. While $\ell_1$ minimization has produced very strong theoretical results, 
thresholding is typically significantly harder to analyze due to its iterative nature. However, 
thresholding algorithms are in general much faster than $\ell_1$ minimization, which makes them 
particularly attractive for the aforementioned neurobiological imaging application due to its 
large problem size. 

In this paper we focus on thresholding as a separation technique for separating point- 
from curvelike structures using radial wavelets and curvelets; in fact, we study the very sim- 
ple technique of single-pass alternating thresholding, which expands the image in wavelets, 
thresholds and reconstructs the point part, then expands the residual in curvelets, thresh- 
olds and reconstructs the curve part. In this paper we aim for a fundamental mathematical 
understanding of the precision of separation allowed by this thresholding method. 
Interestingly, our analysis requires the notions of cluster coherence and clustered/geometric 
sparsity, which were introduced in [18] by the author and Donoho in the context of analyzing 
$\ell_1$ minimization as a separation methodology. 

We find the results in our paper quite surprising in two ways. First, the thresholding 
procedure we consider is very simple, and researchers on thresholding algorithms might at 
first sight dismiss such single-pass alternating thresholding methodology. Therefore, it is 
intriguing to us that we derive a perfect separation result (Theorem 1.1) quite similar to the 
one in our paper [18], where $\ell_1$ minimization as a separation technique was analyzed. Secondly, 
to our mind, it is even more surprising that in Theorems 1.2 and 1.3 we derive even more 
satisfying results by showing that the thresholding index sets converge to the wavefront 
sets of the point- and curvilinear singularities in phase space and that those wavefront sets 
are perfectly separated by the thresholding procedure. This we already suspected to be true 
for $\ell_1$ minimization. However, we are not aware of any analysis tools strong enough to 
derive these results for separation by $\ell_1$ minimization. 

1.1. A Geometric Separation Problem. Let us start by defining the following simple 
but clear model problem of geometric separation (compare also the problem posed in [18]). 

Consider a 'pointlike' object V made of point singularities: 

$$V = \sum_{i=1}^{P} |x - x_i|^{-3/2}. \qquad (1.1)$$

This object is smooth away from the P given points $\{x_i : 1 \le i \le P\}$. Consider as well a 
'curvelike' object C, a singularity along a closed curve $\tau : [0, 1] \to \mathbb{R}^2$: 

$$C = \int_0^1 \delta_{\tau(t)}(\cdot)\, dt, \qquad (1.2)$$

where $\delta_x$ is the usual Dirac delta function located at $x$. The singularities underlying these 
two distributions are geometrically quite different, but the exponent 3/2 is chosen so the 
energy distribution across scales is similar; if $A_r$ denotes the annular region $r < |\xi| < 2r$, 

$$\int_{A_r} |\hat{V}|^2(\xi)\, d\xi \asymp r, \qquad \int_{A_r} |\hat{C}|^2(\xi)\, d\xi \asymp r, \qquad r \to \infty.$$

This choice makes the components comparable as we go to finer scales; the ratio of energies 
is more or less independent of scale. Separation is challenging at every scale. 
Now assume that we observe the 'Signal' 

f = V + C, 

however, the component distributions V and C are unknown to us. 

Definition 1.1. The Geometric Separation Problem requires the recovery of V and C from 
knowledge only of f; here V and C are unknown to us, but obey (1.1), (1.2) and certain regularity 
conditions on the curve $\tau$. 

As there are two unknowns (V and C) and only one observation (f), the problem seems 
improperly posed. We develop a principled, rational approach which provably solves the 
problem according to clearly stated standards. 

1.2. Two Geometric Frames. We now focus on two overcomplete systems for representing 
the object f: 

• Radial Wavelets - a tight frame with perfectly isotropic generating elements. 

• Curvelets - a highly directional tight frame with increasingly anisotropic elements at 
fine scales. 

We pick these because, as is well known, point singularities are coherent in the wavelet frame 
and curvilinear singularities are coherent in the curvelet frame. In Section 1.5 we discuss 
other system pairs. For readers not familiar with frame theory, we refer to [10, 8], where 
terms like 'tight frame' - a Parseval-like property - are carefully discussed. 

The point- and curvelike objects we defined in the previous subsection are real-valued 
distributions. Hence, for deriving sparse expansions of those, we will consider radial wavelets 
and curvelets consisting of real-valued functions. So only angles associated with radians 
$\theta \in [0, \pi)$ will be considered, which later on we will, as is customary, identify with 
$\mathbb{P}^1$, the real projective line. 

We now construct the two selected tight frames as follows. Let $W(r)$ be an 'appropriate' 
window function, where in the following we assume that $W$ belongs to $C^\infty(\mathbb{R})$ and is 
compactly supported on $[-2, -1/2] \cup [1/2, 2]$ while being the Fourier transform of a wavelet. 
For instance, suitably scaled Lemarié-Meyer wavelets possess these properties. We define 
continuous radial wavelets at scale $a > 0$ and spatial position $b \in \mathbb{R}^2$ by their Fourier 
transforms 

$$\hat{\psi}_{a,b}(\xi) = a \cdot W(a|\xi|) \cdot \exp\{i b'\xi\}.$$

The wavelet tight frame is then defined as a sampling of $b$ on a series of regular lattices 
$\{a_j \mathbb{Z}^2\}$, $j \ge j_0$, where $a_j = 2^{-j}$, i.e., the radial wavelets at scale $j$ and spatial position 
$k = (k_1, k_2)'$ are given by the Fourier transform 

$$\hat{\psi}_\lambda(\xi) = 2^{-j} \cdot W(|\xi|/2^j) \cdot \exp\{i k'\xi/2^j\},$$

where we let $\lambda = (j, k)$ index position and scale. 

For the same window function $W$ and a 'bump function' $V$, we define continuous curvelets 
at scale $a > 0$, orientation $\theta \in [0, \pi)$, and spatial position $b \in \mathbb{R}^2$ by their Fourier transforms 

$$\hat{\gamma}_{a,b,\theta}(\xi) = a^{3/4} \cdot W(a|\xi|)\, V(a^{-1/2}(\omega - \theta)) \cdot \exp\{i b'\xi\},$$

where $\omega$ denotes the polar angle of $\xi$. See [4, 5] for more details. The curvelet tight frame 
is then (essentially) defined as a sampling of $b$ on a series of regular lattices 

$$\{R_{\theta_{j,\ell}} D_{a_j} \mathbb{Z}^2\},$$

where $R_\theta$ is planar rotation by $\theta$ radians, $a_j = 2^{-j}$, $\theta_{j,\ell} = \pi\ell/2^{j/2}$, $\ell = 0, \dots, 2^{j/2} - 1$, and $D_a$ 
is anisotropic dilation by $\mathrm{diag}(a, \sqrt{a})$, i.e., the curvelets at scale $j$, orientation $\ell$, and spatial 
position $k = (k_1, k_2)$ are given by the Fourier transform 

$$\hat{\gamma}_\eta(\xi) = 2^{-3j/4} \cdot W(|\xi|/2^j)\, V((\omega - \theta_{j,\ell}) 2^{j/2}) \cdot \exp\{i (R_{\theta_{j,\ell}} D_{2^{-j}} k)'\xi\},$$

where we let $\eta = (j, k, \ell)$ index scale, position, and orientation. (For a precise statement, see [6, 
Section 4.3, pp. 210-211].) 

Roughly speaking, the radial wavelets are 'radial bumps' with position $k/2^j$ and scale $2^{-j}$, 
while the curvelets live on anisotropic regions of width $2^{-j}$ and length $2^{-j/2}$. The wavelets 
are good at representing point singularities while the curvelets are good at representing 
curvilinear singularities. 

Using the same window $W$, we can construct a family of filters $F_j$ with transfer functions 

$$\hat{F}_j(\xi) = W(|\xi|/2^j), \qquad \xi \in \mathbb{R}^2.$$

These filters allow us to decompose a function $g$ into pieces $g_j$ with different scales; the piece 
$g_j$ at subband $j$ arises from filtering $g$ using $F_j$: 

$$g_j = F_j \star g;$$

the Fourier transform $\hat{g}_j$ is supported in the annulus with inner radius $2^{j-1}$ and outer radius 
$2^{j+1}$. Because of our assumption on $W$, we can reconstruct the original function from these 
pieces using the formula 

$$g = \sum_j F_j \star g_j, \qquad g \in L^2(\mathbb{R}^2).$$

We now apply this filtering to our known image $f$, obtaining the truly geometric 
decomposition 

$$f_j = F_j \star f = F_j \star (V + C) = V_j + C_j$$

for each scale $j$. 

For future use, let $\Lambda_j$ denote the collection of indices $(j, k)$ of wavelets at level $j$, and let 
$\Delta_j$ denote the indices $\eta = (j, k, \ell)$ of curvelets at level $j$. 
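
The reconstruction formula for the subband pieces amounts to the window satisfying a partition-of-unity condition, namely that the squares $W(|\xi|/2^j)^2$ sum to one over $j$. The following is a minimal numerical sketch of this condition, not the construction used in the paper: the functions `nu`, `window`, and `subband_energy_sum` are hypothetical names, and the window is a Meyer-type profile built from a polynomial step, so it is only finitely smooth rather than $C^\infty$.

```python
import numpy as np

def nu(x):
    """Polynomial step rising from 0 to 1 on [0, 1] (an assumed choice, only finitely smooth)."""
    x = np.clip(x, 0.0, 1.0)
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)

def window(r):
    """Meyer-type radial window supported on [1/2, 2]; a stand-in for W, not the paper's window."""
    r = np.abs(np.atleast_1d(np.asarray(r, dtype=float)))
    w = np.zeros_like(r)
    rise = (r >= 0.5) & (r <= 1.0)
    fall = (r > 1.0) & (r <= 2.0)
    w[rise] = np.sin(0.5 * np.pi * nu(2.0 * r[rise] - 1.0))
    w[fall] = np.cos(0.5 * np.pi * nu(r[fall] - 1.0))
    return w

def subband_energy_sum(r, scales):
    """Sum over j of W(r / 2^j)^2, the quantity that must equal 1 for perfect reconstruction."""
    return sum(window(r / 2.0**j) ** 2 for j in scales)

# Radii between 2^0 and 2^6 are covered by the subbands j = -1, ..., 7.
r = np.linspace(1.0, 64.0, 2001)
total = subband_energy_sum(r, range(-1, 8))
```

On each dyadic interval only two neighbouring subbands overlap, and their squared windows are a sine and a cosine of the same argument, which is why the sum is identically one.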

1.3. Separation via Thresholding. We now consider a simple 'one-step-thresholding' 
method - which we also refer to as 'single pass alternating thresholding' method - formalizing 
the first few steps of a recipe for separation pointed out by Coifman and Wickerhauser [11, 
Fig. 26(a-h)] (cf. also [18]). It is formally specified in Figure 1. 



One-Step-Thresholding 

Parameters: 

• Filtered signal $f_j$ for a scale $j$. 

• Thresholding parameter $\varepsilon < 1/64$. 

Algorithm: 

1) Threshold Wavelet Coefficients: 

a) Obtain wavelet coefficients $c_\lambda = \langle f_j, \psi_\lambda \rangle$, $\lambda \in \Lambda_j$. 

b) Apply threshold to obtain the set of significant coefficients 
$T_{1,j} = \{\lambda : |c_\lambda| > 2^{\varepsilon j}\}$. 

2) Reconstruct Wavelet Component and Residualize: 

a) Set $W_j = \sum_{\lambda \in T_{1,j}} c_\lambda \psi_\lambda$. 

b) Set $\mathcal{R}_j = f_j - W_j = \sum_{\lambda \notin T_{1,j}} c_\lambda \psi_\lambda$. 

3) Threshold Curvelet Coefficients of Residual: 

a) Compute $d_\eta = \langle \mathcal{R}_j, \gamma_\eta \rangle$, $\eta \in \Delta_j$. 

b) Apply threshold to obtain the set of significant coefficients 
$T_{2,j} = \{\eta : |d_\eta| > 2^{j(1/4-\varepsilon)}\}$. 

4) Reconstruct Curvelet Component: 

a) Compute $\tilde{C}_j = \sum_{\eta \in T_{2,j}} d_\eta \gamma_\eta$. 

Output: 

• Sets of significant coefficients: $T_{1,j}$ and $T_{2,j}$. 

• Approximations to $V_j$ and $C_j$: $W_j$ and $\tilde{C}_j$. 

Figure 1. One-Step Thresholding Algorithm to approximately decompose 
$f_j = V_j + C_j$. 

One-Step is a very simple, easily implementable way to approximately decompose the 
signal $f_j$ into purported pointlike and curvelike parts. Currently popular thresholding 
algorithms are usually far more complex than One-Step: they apply similar operations multiple 
times, with stopping rules, threshold adaptation, etc. It therefore may be surprising that this 
very simple noniterative algorithm, with nonadaptive threshold, also works well. The 
thresholds are even chosen almost as if the data were not composed at all: The first threshold 
$2^{\varepsilon j}$ is chosen coarsely below the rate $O(2^{j/2})$ of significant wavelet coefficients of the 
'naked' point singularity $V_j$; the second threshold $2^{j(1/4-\varepsilon)}$ is chosen just slightly below the 
rate $O(2^{j/4})$ of significant curvelet coefficients of the 'naked' curvilinear singularity 
$C_j$. Notice that we threshold the wavelet component more aggressively; and we refer to 
Section 2.3 for more precise heuristics on the choice of these two thresholds. It comes as 
a second surprise that our estimates as well as the framework of geometric separation are 
strong enough to survive this 'brutally simple' thresholding strategy, as is shown in the 
following result as well as Theorems 1.2 and 1.3. 
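
The steps of Figure 1 can be sketched in a finite-dimensional toy setting. This is an illustration under loud assumptions, not the paper's setting: the function `one_step_threshold` and the helper `dct_basis` are hypothetical names, the two frames are a spike basis and a cosine basis standing in for wavelets and curvelets, and the thresholds are fixed numbers rather than $2^{\varepsilon j}$ and $2^{j(1/4-\varepsilon)}$.

```python
import numpy as np

def one_step_threshold(f, psi, gamma, t1, t2):
    """Single-pass alternating thresholding for two real Parseval frames.

    psi, gamma: 2-D arrays whose rows are the frame elements.
    t1, t2: thresholds for the first and second frame.
    """
    c = psi @ f                        # 1a) coefficients in the first frame
    keep1 = np.abs(c) > t1             # 1b) significant index set
    w = psi.T @ (c * keep1)            # 2a) reconstruct the first component
    residual = f - w                   # 2b) residualize
    d = gamma @ residual               # 3a) coefficients of the residual in the second frame
    keep2 = np.abs(d) > t2             # 3b) significant index set
    c_tilde = gamma.T @ (d * keep2)    # 4a) reconstruct the second component
    return keep1, keep2, w, c_tilde

def dct_basis(n):
    """Orthonormal DCT-II basis; row m is the m-th cosine atom (toy stand-in for curvelets)."""
    x = np.arange(n)
    m = np.arange(n)[:, None]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (x + 0.5) * m / n)
    basis[0] /= np.sqrt(2.0)
    return basis

n = 64
spikes = np.eye(n)                     # toy stand-in for the wavelet frame
cosines = dct_basis(n)

point_part = np.zeros(n)
point_part[[10, 40]] = 5.0             # two "point singularities"
curve_part = 3.0 * cosines[7]          # one spread-out atom
f = point_part + curve_part

keep1, keep2, w, c_tilde = one_step_threshold(f, spikes, cosines, t1=2.0, t2=1.0)
rel_err = (np.linalg.norm(w - point_part) + np.linalg.norm(c_tilde - curve_part)) / (
    np.linalg.norm(point_part) + np.linalg.norm(curve_part)
)
```

In this toy example the first threshold picks out exactly the two spike positions and the second picks out the single cosine atom; `rel_err` is the finite-dimensional analogue of the relative separation error studied below.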

For the following result, which will be proven in Section 5.3, we continue to suppose the 
sequence $(f_j)_j$ is known; thus the ideal decomposition into a pointlike and curvelike part 
would be given by $f_j = V_j + C_j$. We apply One-Step, which outputs approximations $W_j$ 
and $\tilde{C}_j$ to $V_j$ and $C_j$, respectively. 

Theorem 1.1. Asymptotic Separation via One-Step Thresholding. 

$$\frac{\|W_j - V_j\|_2 + \|\tilde{C}_j - C_j\|_2}{\|V_j\|_2 + \|C_j\|_2} \to 0, \qquad j \to \infty.$$



It is well known that $\ell_1$ minimization and thresholding are closely connected in various 
ways. In the past few years it has been frequently found that results on successful $\ell_1$ 
minimization subsequently inspired parallel results on thresholding methods. In a particular 
sense, this happened here as well; after obtaining an asymptotic separation result using $\ell_1$ 
minimization (cf. [18]), we found a similar result for this surprisingly simple thresholding 
procedure. However, even more intriguingly, when performing this thresholding procedure 
- as opposed to $\ell_1$ minimization - we are able to derive even more satisfying results 
than Theorem 1.1, to which we now turn our attention. 

1.4. Wavefront Set Separation. The very simplicity of One-Step makes it possible to 
analyze delicate phenomena which do not seem analytically tractable for iterative 
thresholding or even for the $\ell_1$ minimization problem considered in [18]. 

The geometric separation model we have been studying is distinguished by the behavior 
of its singularities. One might hope that the two purported geometric components $\tilde{C}$ and 
$\tilde{V}$, defined by 

$$\tilde{V} = \sum_j F_j \star W_j \quad\text{and}\quad \tilde{C} = \sum_j F_j \star \tilde{C}_j,$$

have exactly the singularities that one expects. To articulate this goal requires the notions 
of wavefront set and phase space from microlocal analysis, which are reviewed below and in 
Section 2. Intuitively, phase space is the collection of location/direction pairs and the 
wavefront set $WF(f)$ of a distribution $f$ is the subset of phase space where $f$ exhibits singularities. 
Point singularities are omnidirectional, while curvilinear singularities point in one direction. 

Theorem 1.1 shows that the distributions V and C can be arbitrarily well approximated by 
thresholding - a similar result was derived in our companion paper [18] for $\ell_1$ minimization. 
However, the most desirable and also rhetorically effective matching condition would be an 
arbitrarily perfect approximation also of the associated wavefront sets $WF(V)$ and $WF(C)$. 

Surprisingly, we derive two results in this direction for One-Step - one on the 'analysis' 
side and the other on the 'synthesis' side. The first result shows that the wavefront sets of 
V and C can indeed be approximated with arbitrarily high precision by the significant 
thresholding coefficients $T_{1,j}^{PS}$ and $T_{2,j}^{PS}$. As a measure of distance we employ the nonsymmetric 
Hausdorff-style distance d(A, B), say, in phase space measuring the largest distance from any 
point of a subset of phase space A to the closest corresponding point of a different subset B. 

As a second result, we prove that the wavefront sets of the synthesized objects $\sum_j F_j \star W_j$ 
and $\sum_j F_j \star \tilde{C}_j$ coincide with $WF(V)$ and $WF(C)$, respectively. We might interpret this 
result as recovering $WF(V)$ and $WF(C)$ from the composed image $f$; hence, in this sense, we 
not only separate the pointlike structures from the curvelike structures, but moreover 
separate their wavefront sets. 

For a precise statement of the aforementioned two results, we need to introduce some 
notions from microlocal analysis, which will be our main analysis methodology. Phase space is 
the space of all direction/location pairs $(b, \theta)$, where $b \in \mathbb{R}^2$ and the orientational component 
$\theta$ will be regarded as an element in $\mathbb{P}^1$, the real projective space¹ in $\mathbb{R}^2$. 

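Since radial wavelets are oriented in all directions, we denote the set of significant phase 
space pairs produced by the wavelet component of algorithm One-Step by 

$$T_{1,j}^{PS} = \{b_{j,k} : (j, k) \in T_{1,j}\} \times \mathbb{P}^1; \qquad (1.3)$$

the set of significant phase space pairs for the curvelet component of One-Step is 

$$T_{2,j}^{PS} = \{(b_{j,k,\ell}, \theta_{j,\ell}) : (j, k, \ell) \in T_{2,j}\}. \qquad (1.4)$$

Here $b_{j,k}$ and $b_{j,k,\ell}$ denote the spatial positions of the wavelet with index $(j, k)$ and of the 
curvelet with index $(j, k, \ell)$, respectively. We further require the notion of a metric in phase 
space, which we choose to be 

$$d_{PS}((b, \theta), (b', \theta')) = (\|b - b'\|_2^2 + |\theta - \theta'|^2)^{1/2}, \qquad (b, \theta), (b', \theta') \in \mathbb{R}^2 \times \mathbb{P}^1,$$

and its associated asymmetric distance 

$$d_{PS}(C, C') = \max_{c \in C} \min_{c' \in C'} \|c - c'\|_2, \qquad \text{where } C, C' \subset \mathbb{R}^2 \times \mathbb{P}^1.$$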
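
For finite sets of phase-space points, the metric and the asymmetric distance above can be computed directly. The sketch below is an illustration only: the names `angle_dist`, `d_ps`, and `asym_dist` are hypothetical, orientations are represented as numbers in $[0, \pi)$, the angular difference is taken as geodesic distance on $\mathbb{P}^1$, and the inner distance in the max-min is taken to be the phase-space metric itself, which is an interpretive assumption.

```python
import math

def angle_dist(theta1, theta2):
    """Geodesic distance between two orientations on P^1, identified with [0, pi)."""
    d = abs(theta1 - theta2) % math.pi
    return min(d, math.pi - d)

def d_ps(p, q):
    """Phase-space metric between points p = (b, theta) and q = (b', theta')."""
    (b, theta), (b2, theta2) = p, q
    spatial_sq = (b[0] - b2[0]) ** 2 + (b[1] - b2[1]) ** 2
    return math.sqrt(spatial_sq + angle_dist(theta, theta2) ** 2)

def asym_dist(A, B):
    """Largest distance from a point of A to its closest point in B (not symmetric)."""
    return max(min(d_ps(a, b) for b in B) for a in A)

A = [((0.0, 0.0), 0.0)]
B = [((0.0, 0.0), 0.0), ((3.0, 4.0), 0.0)]
forward = asym_dist(A, B)    # every point of A has a partner in B
backward = asym_dist(B, A)   # the point (3, 4) of B is far from A
```

The example shows why the distance is asymmetric: `forward` vanishes because A is contained in B, while `backward` equals the distance of the extra point of B from A.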

Section 6 then proves the following theorem. 

Theorem 1.2. Approximation of the Wavefront Sets. 
(i) 

$$\limsup_{j \to \infty} d_{PS}(T_{1,j}^{PS}, WF(V)) = 0.$$

(ii) 

$$\limsup_{j \to \infty} d_{PS}(T_{2,j}^{PS}, WF(C)) = 0.$$

¹ Here we identify $\mathbb{P}^1$ with $[0, \pi)$ and freely write one or the other in what follows. It may at first seem 
more natural to think of directions $[0, 2\pi)$ rather than orientations $[0, \pi)$; note, however, that in this paper we 
consider real-valued distributions $V + C$ measured by real-valued curvelets $\gamma_\eta$, so directions are not resolvable, 
only orientations. We also frequently abuse notation as follows: we will write $|\theta - \theta'|$ when what is actually 
meant is geodesic distance between two points on $\mathbb{P}^1$. 



In short, the significant coefficients in each purported geometric component cluster increas- 
ingly around the wavefront set of the underlying 'true' geometric component. We further 
derive the following result (proved in Section 7). 

Theorem 1.3. Separation of the Wavefront Sets. 

$$WF\Big(\sum_j F_j \star W_j\Big) = WF(V) \quad\text{and}\quad WF\Big(\sum_j F_j \star \tilde{C}_j\Big) = WF(C).$$

This implies that the wavefront sets of the reconstructed components are precisely what 
we might hope for. 

It seems plausible that results similar to Theorems 1.2 and 1.3 could hold, in particular, 
also for separation via $\ell_1$ minimization, but we do not know of analytical tools powerful enough 
to prove this. 

1.5. Extensions. We would like to point out that the analysis of One-Step for solving the 
special separation problem we focus on in this paper gives rise to very extensive 
generalizations and extensions; a few examples are stated in the sequel. 

• More General Classes of Objects. Theorems 1.1-1.2 can be generalized to other 
situations. First, we could consider singularities of different orders. This would 
allow C to model 'cartoon' images, where the curvilinear singularities are now the 
boundaries of the pieces for piecewise functions. Second, we can allow smooth 
perturbations, i.e., $f = (V + C + g) \cdot h$ where $g$, $h$ are smooth functions of rapid decay 
at $\infty$. In this situation, we let the denominator in Theorem 1.1 be simply $\|f_j\|_2$. 

• Other Frame Pairs. Theorems 1.1-1.2 hold without change for many other pairs of 
frames and bases, such as, e.g., by [28], for the pair orthonormal separable Meyer 
wavelets and shearlets (cf. [24, 29, 26, 30]). 

• Noisy Data. Theorems 1.1-1.2 are resilient to noise impact; an image composed of V 
and C with additive 'sufficiently small' noise exhibits the same asymptotic separation. 

• Rate of Convergence. Theorem 1.1 can be accompanied by explicit decay estimates. 

2. Microlocal Analysis Viewpoint 

The morphological difference between the two structures we intend to extract - points and 
curves - is the key to separation. In this section we will describe why heuristically this key 
issue makes separation possible as well as present our main means to choose the 'correct' 
thresholds. 

2.1. Point- and Curvelike Structures in Phase Space. Our intuition as well as hard 
analysis is based on a microlocal analysis viewpoint, which through the notion of wavefront 
sets will allow us to, roughly speaking, include the morphology of the structures by adding a 
third dimension to the spatial domain. Let us start by recalling the notion of wavefront set and 
- related to this - the notions of singular support and phase space. The singular support 
of a distribution $f$, sing supp($f$), is defined to be the set of points where $f$ is not locally 
$C^\infty$. The notion of wavefront set then goes beyond the classical spatial domain picture and 
extends it to phase space, which consists of position-orientation pairs $(b, \theta)$; see the more 
detailed discussion in Section 1.4. The wavefront set $WF(f)$ lives in this phase space and 
can be coarsely described as the set of position-orientation pairs at which $f$ is nonsmooth; 
for more details, see [25, 5, 29]. 

To illustrate these notions and also prepare our heuristic argument why separation through 
thresholding is possible, we first consider the distribution V. A short computation shows 
that 

$$\text{sing supp}(V) = \{x_i\}_{i=1}^{P} \quad\text{and}\quad WF(V) = \text{sing supp}(V) \times \mathbb{P}^1,$$

which can be regarded as a manifestation of the isotropic nature of the point singularities. 
Illustrations of sing supp(V) and of $WF(V)$ are presented in Figure 2. 




Figure 2. Left panel: singular support of V. Right panel: wavefront set of 
V in phase space. 

For the distribution C, we obtain 

$$\text{sing supp}(C) = \text{image}(\tau) \quad\text{and}\quad WF(C) = \{(\tau(t), \theta(t)) : t \in [0, L(\tau)]\},$$

where $\tau(t)$ is a unit-speed parametrization of C and $\theta(t)$ is the normal direction to C at $\tau(t)$ 
regarded in $\mathbb{P}^1$. Here, the anisotropy and - in comparison with Figure 2 - the morphological 
difference to V become evident. An illustration of sing supp(C) and of $WF(C)$ is presented 
in Figure 3. 

2.2. Wavelets and Curvelets in Phase Space. Although being smooth functions, in a 
certain sense, wavelets and curvelets can be regarded as leaving an approximate footprint in 
phase space. To make this statement rigorous, we first observe the approximate footprint in 
the spatial domain left by wavelets and curvelets as detailed in the following two lemmata taken 
from [18]. As expected, these observations show the isotropic nature of wavelets in contrast 
to the anisotropic nature of curvelets. 

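Lemma 2.1 ([18]). For each $N = 1, 2, \dots$ there is a constant $c_N$ so that 

$$|\psi_{a,b}(x)| \le c_N \cdot a^{-1} \cdot \langle |x - b|/a \rangle^{-N}, \qquad \forall a \in \mathbb{R}_+\ \forall b, x \in \mathbb{R}^2,$$

where $\langle x \rangle = (1 + |x|^2)^{1/2}$. 

Figure 3. Left panel: singular support of C. Right panel: wavefront set of C 
in phase space. 

Lemma 2.2 ([18]). For each $N = 1, 2, \dots$ there is a constant $c_N$ so that 

$$|\gamma_{a,b,\theta}(x)| \le c_N \cdot a^{-3/4} \cdot \langle |x - b|_{a,\theta} \rangle^{-N}, \qquad \forall a \in \mathbb{R}_+\ \forall \theta \in [0, \pi)\ \forall b, x \in \mathbb{R}^2.$$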

Since it is known from [5] that the continuous curvelet transform precisely resolves the 
wavefront set of distributions, we might consider the image of wavelets and curvelets under 
the continuous curvelet transform for 'sufficiently small' scale as a footprint of these in 
phase space. An illustration is given in Figure 4, and for a detailed description we refer the 
interested reader to [18]. 




Figure 4. Left panel: wavefront set of the observed data f = V + C. Right 
panel: phase space footprint of radial wavelets and curvelets. 

Visually, wavelets are perfectly adapted to react strongly to V, in the same way as curvelets 
react strongly to C. This will now be made precise and will lead to the chosen thresholds 
for separation. 

2.3. Road Map to the 'Correct' Thresholds. To slowly approach a rigorous phrasing 
of the aforementioned strong reaction, we first consider the reaction of wavelets to both V 
and C. A simplified form of Lemma 3.2 states that 

$$|\langle \psi_{a_j, x_i}, V_j \rangle| = O(2^{j/2}) \quad\text{as } j \to \infty, \qquad (2.1)$$

with fast decay² for other locations than $x_i$, and Lemma 3.3 shows that, for each $j$ and $b$, 

$$|\langle \psi_{a_j, b}, C_j \rangle| = O(1). \qquad (2.2)$$

Secondly, turning our attention to curvelets and their reaction to V and C, we observe that, 
by a simplified form of Lemma 4.4, for each $\theta$, 

$$|\langle \gamma_{a_j, x_i, \theta}, V_j \rangle| = O(2^{j/2}) \quad\text{as } j \to \infty, \qquad (2.3)$$

and, for $b$ positioned on the curve $\tau$ and $\theta$ pointing in the direction perpendicular to the 
tangent to the curve in $b$, 

$$|\langle \gamma_{a_j, b, \theta}, C_j \rangle| = O(2^{j/4}). \qquad (2.4)$$

Examining closely (2.1) and (2.2), it becomes immediately evident that the correct first 
threshold - which should capture $V_j$ by thresholding wavelet coefficients of $f_j$ - needs to be 
chosen 'slightly higher' than a constant asymptotically, wherefore we choose it equal to $2^{\varepsilon j}$ 
for small $\varepsilon$. 

The second threshold seems to be a somewhat more serious problem, since (2.3) and (2.4) 
show that curvelets react more strongly to a point singularity than to a curvilinear singularity. 
However, we wish the reader to keep in mind that ideally all energy from $V_j$ is already captured 
during the first thresholding procedure. Hence, it should presumably be 'safe' to choose the 
second threshold - which shall capture $C_j$ by thresholding curvelet coefficients of the residual 
$\mathcal{R}_j$ - with asymptotic behavior $o(2^{j/4})$. To avoid unnecessary risks, we choose it only slightly 
below $2^{j/4}$, more precisely, equal to $2^{j(1/4-\varepsilon)}$. 
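
To see how the two thresholds sit relative to the coefficient rates in (2.1), (2.2), and (2.4), one can tabulate the orders of magnitude for a few scales. The sketch below ignores all constants hidden in the $O(\cdot)$ terms; the function name `rates` and the sample value $\varepsilon = 1/128$ (any value below 1/64 would do) are illustrative assumptions.

```python
def rates(j, eps):
    """Orders of magnitude at scale j, with all O(.) constants set to 1 (an assumption)."""
    return {
        "wavelet_threshold": 2.0 ** (eps * j),           # first threshold 2^{eps j}
        "wavelet_on_point": 2.0 ** (j / 2),              # rate in (2.1)
        "wavelet_on_curve": 1.0,                         # rate in (2.2)
        "curvelet_threshold": 2.0 ** (j * (0.25 - eps)), # second threshold 2^{j(1/4 - eps)}
        "curvelet_on_curve": 2.0 ** (j / 4),             # rate in (2.4)
    }

eps = 1.0 / 128.0
table = {j: rates(j, eps) for j in (8, 16, 32)}

# Gap between the point response and the first threshold grows like 2^{j(1/2 - eps)}.
gaps = [table[j]["wavelet_on_point"] / table[j]["wavelet_threshold"] for j in (8, 16, 32)]
```

The first threshold lies between the bounded response to the curve and the growing response to the points, and the second lies just below the curvelet response to the curve, which is the ordering the heuristic above relies on.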

2.4. What Type of Separation Result is Preferable? Applying now One-Step (cf. 
Figure 1) yields significant coefficient sets $T_{1,j}$ and $T_{2,j}$ and approximations to $V_j$ and $C_j$: 
$W_j$ and $\tilde{C}_j$. In Sections 1.3 and 1.4, we presented three theorems on the 'quality' of this 
separation, which we would now like to discuss and compare. 

Theorem 1.1 studies the relative separation error and proves that asymptotically this error 
can be made arbitrarily small for sufficiently fine scale. This is in a sense the most natural 
question to ask, and the theorem provides the answer one would hope for. 

However, from a microlocal analysis viewpoint, the most satisfying separation to derive 
would be the perfect separation of the wavefront sets of V and C, i.e., to separate the 
LHS of Figure 4 into the RHS of Figures 2 and 3. This would be considerably 'stronger' 
than Theorem 1.1 in the following sense: Once the wavefront sets are extracted, we have 
complete information about the underlying singularities, in contrast to the merely asymptotic 
knowledge provided by Theorem 1.1. 

Knowledge about $WF(V)$ and $WF(C)$ could come either from $T_{1,j}$ and $T_{2,j}$ or from 
$W_j$ and $\tilde{C}_j$. The sets of significant coefficients generated by thresholding do not provide 
an immediate means for separating the wavefront sets, since they live on the analysis side 
(as opposed to the synthesis side). Astonishingly, they are still able to precisely locate the 
wavefront sets $WF(V)$ and $WF(C)$; more precisely, they 'converge' to the wavefront sets in 
phase space, measured in the phase space metric $d_{PS}$, as $j \to \infty$, which is the statement of 
Theorem 1.2. This shows that the points in $T_{1,j}$ and $T_{2,j}$ are located in tubes around $WF(V)$ 
and $WF(C)$, respectively, which become more concentrated around these wavefront sets as 
the scale becomes finer. The corresponding objects on the synthesis side, i.e., $W_j$ and $\tilde{C}_j$, 
now allow separation of $WF(V)$ and $WF(C)$, in the sense that the wavefront sets of the 
reconstructed distributions $\sum_j F_j \star W_j$ and $\sum_j F_j \star \tilde{C}_j$ precisely coincide with $WF(V)$ and 
$WF(C)$. This is the content of Theorem 1.3. 

² As is customary, we refer to the behavior $O(a^N)$ as $a \to 0$ for all $N = 1, 2, \dots$ as fast decay. 

3. Geometry of the Thresholded Wavelet Coefficients 

Following the ordering of the thresholding, we first focus on the set of significant radial 
wavelet coefficients $T_{1,j}$ generated by Step 1) of One-Step-Thresholding (see Figure 1), 
in particular, on its phase space footprint, defined in (1.3) as 

$$T_{1,j}^{PS} = \{b \in \mathbb{R}^2 : |\langle f_j, \psi_{a_j,b} \rangle| > a_j^{-\varepsilon}\} \times \mathbb{P}^1.$$

Our objective will be to derive a tube around $T_{1,j}^{PS}$ in phase space with controllable 'size'. 
This tube should therefore be a neighborhood of $WF(V)$, and hence be isotropic. 
For our analysis, we first notice that WLOG we can assume that 

$$V = |x|^{-3/2}. \qquad (3.1)$$

From here, the result for the original V as defined in (1.1) can be concluded for the 
following reasons: Firstly, all results are translation invariant, hence the results follow 
immediately for a point in the spatial domain other than the origin; and secondly, the change 
from one point to finitely many points just introduces a constant independent of $j$. 

3.1. Estimates for Wavelet Coefficients. We start by analyzing the interaction of wavelet 
atoms. The technical proof of the following result is provided in Section 8.1. 

Lemma 3.1. For each $N = 1, 2, \dots$, there is a constant $c_N$ so that 

$$|\langle \psi_{a,b}, \psi_{a_0,b_0} \rangle| \le c_N \cdot 1_{\{|\log_2(a/a_0)| < 3\}} \cdot \langle |b - b_0|/a \rangle^{-N}.$$

Next, we recall a result derived in [18] for radial wavelet coefficients of our point singularity 
(3.1). 

Lemma 3.2 ([18]). For each $N = 1, 2, \dots$, there is a constant $c_N$ so that 

$$|\langle \psi_{a_j,b}, V_j \rangle| \le c_N \cdot a_j^{-1/2} \cdot \langle |b/a_j| \rangle^{-N}, \qquad \forall j \in \mathbb{Z}\ \forall b \in \mathbb{R}^2.$$

In the sequel, we will further require an estimate of the wavelet coefficients of the 
curvilinear singularity $C_j$. Notice that the following estimate provides only a very coarse upper 
bound. In order to derive a more detailed estimate, the curve would need to be much more 
carefully analyzed, as will be done in Section 4. However, the estimate as stated below is 
all we will require. 

Lemma 3.3. There exists a constant c so that 

|(^a,,6,C)l <c, VjeZV6eR2. 
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Proof. By Lemma 2.1 and the definition of tlie distribution C, 

\{il^a„b,C)\< f\^Pa,Armdt<CN-a-'- [\\T{t)-b\/a)-''dt. (3.2) 
^0 Jo 

WLOG we assume tliat b G t([0, 1]) witli r(0) = b, say, and we can also assume tliat 
b = {bi, 0). Choosing a ball Br{b) around b with r chosen arbitrarily small (yet, independent 
of j), there exists some < 5 < 1/2 such that 

t{[0,1]) n Br{b) = r{[l - 6,1]U [0,6]). 

This information is now used to split the last integral in (3.2) according to 

1 



{\r{t)-b\/aydt= {\T{t)-b\/a)-^dt+ / {\T{t) - bl/a^ dt =: h + h. 

J[1-5,1]U[0,S] J[5,l-5] 

(3.3) 

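For estimating $I_1$, we first observe that it is sufficient to consider $\int_{[0,\delta]}$ for symmetry
reasons. For $r$ small enough, the curve inside $B_r(b)$ can be approximated arbitrarily well by
its osculating circle, whose center we denote by $z = (z_1, 0)$. Combining these considerations
and exploiting the Taylor series approximation of the cosine,

\[
\begin{aligned}
\int_{[0,\delta]} \langle |\tau(t) - b|/a_j \rangle^{-N} dt
&\le c \cdot \int_{[0,\delta]} \langle |(b_1 - z_1)(\cos t, \sin t) + z - b|/a_j \rangle^{-N} dt \\
&= c \cdot \int_{[0,\delta]} \langle |b_1 - z_1| \sqrt{2(1 - \cos t)}/a_j \rangle^{-N} dt \\
&\le c' \cdot \int_{[0,\delta]} \langle |b_1 - z_1| \cdot t/a_j \rangle^{-N} dt \\
&= c' \cdot a_j/|b_1 - z_1| \int_{[0, |b_1 - z_1| \delta/a_j]} \langle t \rangle^{-N} dt \\
&\le c'' \cdot a_j/|b_1 - z_1|.
\end{aligned} \tag{3.4}
\]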
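Using the definition of $r$, the integral $I_2$ can be easily estimated as

\[ \int_{[\delta, 1-\delta]} \langle |\tau(t) - b|/a_j \rangle^{-N} dt \le \int_{[\delta, 1-\delta]} \langle r/a_j \rangle^{-N} dt \le \langle r/a_j \rangle^{-N}. \tag{3.5} \]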

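Summarizing, by (3.2)-(3.5), there exists some constant $c$ (independent of $a_j$ and $b$) such
that

\[ |\langle \psi_{a_j,b}, \mathcal{C} \rangle| \le c_N \cdot a_j^{-1} \cdot \left( a_j/|b_1 - z_1| + \langle r/a_j \rangle^{-N} \right) \le c. \]

The lemma is proved. □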

3.2. Geometry of $\mathcal{T}_{1,j}^{PS}$. We now first analyze the set $\mathcal{T}_{1,j}^{PS}$ by the following two lemmata.

Lemma 3.4. Let $(b, \theta) \in (\mathcal{T}_{1,j}^{PS})^c$, and let $j$ be sufficiently large. Then, for each $N = 1, 2, \dots$,
there is a constant $c_N$ so that

\[ |b/a_j| \ge c_N \cdot 2^{j(1-2\varepsilon)/(2N)}. \]
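Proof. Let $b \in \mathbb{R}^2$ be such that

\[ |\langle \psi_{a_j,b}, \mathcal{P}_j \rangle + \langle \psi_{a_j,b}, \mathcal{C}_j \rangle| < 2^{\varepsilon j}. \]

Since by Lemma 3.3, $|\langle \psi_{a_j,b}, \mathcal{C} \rangle|$, and hence $|\langle \psi_{a_j,b}, \mathcal{C}_j \rangle|$, is bounded by a constant $c$, say, we
have

\[ |\langle \psi_{a_j,b}, \mathcal{P}_j \rangle| < 2^{\varepsilon j} + c. \]

Next we use the estimate in Lemma 3.2 as a model to conclude that

\[ \langle |b/a_j| \rangle^{-N} \le c_N \cdot 2^{-j/2} (2^{\varepsilon j} + c). \]

Thus, since for sufficiently large $j$, we have $2^{\varepsilon j} > c$,

\[ |b/a_j| \ge \left( (c_N \cdot 2^{-j/2}(2^{\varepsilon j} + c))^{-2/N} - 1 \right)^{1/2} \ge \left( c_N \cdot 2^{j(1-2\varepsilon)/N} - 1 \right)^{1/2}. \]

Letting $j$ be large enough so that $(c_N/2) \cdot 2^{j(1-2\varepsilon)/N} \ge 1$ proves the lemma. □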
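Lemma 3.5. Let $(b, \theta) \in \mathcal{T}_{1,j}^{PS}$. Then, for each $N = 1, 2, \dots$, there is a constant $c_N$ so that

\[ |b/a_j| \le c_N \cdot 2^{j(1-2\varepsilon)/(2N)}. \]

Proof. Let $b \in \mathbb{R}^2$ be such that

\[ |\langle \psi_{a_j,b}, \mathcal{P}_j \rangle + \langle \psi_{a_j,b}, \mathcal{C}_j \rangle| \ge 2^{\varepsilon j}. \]

Since by Lemma 3.3, $|\langle \psi_{a_j,b}, \mathcal{C} \rangle|$, and hence $|\langle \psi_{a_j,b}, \mathcal{C}_j \rangle|$, is bounded by a constant $c$, say, we
have

\[ |\langle \psi_{a_j,b}, \mathcal{P}_j \rangle| \ge 2^{\varepsilon j} - c. \]

Next we use the estimate in Lemma 3.2 as a model to conclude that

\[ \langle |b/a_j| \rangle^{-N} \ge c_N \cdot 2^{-j/2} (2^{\varepsilon j} - c). \]

Thus, since for sufficiently large $j$, we have $2^{\varepsilon j} > c$,

\[ |b/a_j| \le \left( (c_N \cdot 2^{-j/2}(2^{\varepsilon j} - c))^{-2/N} - 1 \right)^{1/2} \le \left( c_N \cdot 2^{j(1-2\varepsilon)/N} \right)^{1/2}. \]

The lemma is proved. □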
We certainly hope (and expect) that the threshold is set in such a way that the wavefront
set of $\mathcal{P}$ is contained in $\mathcal{T}_{1,j}^{PS}$. This is obviously the first requirement for being able to
separate both wavefront sets $WF(\mathcal{P})$ and $WF(\mathcal{C})$ through Single-Pass Alternating Thresholding
(compare Theorem 1.3). The next result shows that this is indeed the case.

Lemma 3.6. For j sufficiently large, 
Hence, in particular. 
Proof. By Parseval,

\[ |\langle \mathcal{P}_j, \psi_{a_j,0} \rangle| = 2\pi |\langle \hat{\mathcal{P}}_j, \hat{\psi}_{a_j,0} \rangle| = 2\pi \cdot 2^{j/2} \int W^2(|\xi|) \, |\xi|^{-1/2} \, d\xi. \]
Hence, we can conclude that, for j sufficiently large. 

This proves the first claim.

For the 'in particular' part, recall that $WF(\mathcal{P}) = \{0\} \times \mathbb{P}^1$. By Lemma 3.3 and the
previous consideration, for sufficiently large $j$,

The lemma is proved. □ 

4. Geometry of the Thresholded Curvelet Coefficients 

This section aims to derive a fundamental geometric understanding of the cluster of
curvelet coefficients $\mathcal{T}_{2,j}$ generated by Step 3) of One-Step-Thresholding from the residual
generated in Step 2) (see Figure 1). The phase space geometry will play an essential role in
setting up the analysis correctly, hence it will be beneficial to study the projection of $\mathcal{T}_{2,j}$
onto phase space, defined in (1.4), as

\[ \mathcal{T}_{2,j}^{PS} = \{ (b, \theta) \in \mathbb{R}^2 \times [0, \pi) : |\langle \mathcal{R}_j, \gamma_{a_j,b,\theta} \rangle| \ge 2^{j(1/4 - \varepsilon)} \}. \]

Morally, the points in phase space associated with significant curvelet coefficients, given by
$\mathcal{T}_{2,j}^{PS}$, are contained in a tube around $WF(w\mathcal{L})$ in phase space. The main objective will now
be to explicitly define such a tube around the phase space footprint of this cluster, over
which we have more control. This will become crucial for handling the thresholded curvelet
coefficients in the proofs of Theorems 1.1-1.3.

4.1. Bending the Curve. We first face the problem of how to deal with the curvilinear 
singularity. In [18], this problem was tackled by carefully and smoothly breaking the curve 
into pieces, bending each piece, and then combining pieces in the end. This technique shall 
also be applied here. For the convenience of the reader, we review the main ideas of this 
particular approach. 

First, a quantitative 'tubular neighborhood theorem' is developed to allow local
bending of the curve. Due to the regularity of the curve, there exists some $\rho$, small compared to
the curvature of $\tau$, so that

-(i+i)p 

\T"{t)\dt<e, z = o,...,i22iii^=:m. 

'{i-i)p 

Now consider the following local coordinate system in the vicinity of $\tau$. Let $t_i = i\rho$, for
$i = 0, \dots, m$ with $\tau(t_0) = \tau(t_m)$, since $\tau$ is closed. Then we have the following

Lemma 4.1 ([18]). (Tubular Neighborhood Theorem) For sufficiently small $\varepsilon > 0$,
there is some $\varepsilon' > 0$ so that, for $X_{\varepsilon'} = [-\varepsilon', \varepsilon'] \times [-\rho, \rho]$, we have:

• for each $i = 0, \dots, m$, there exists a tube $Y^i_{\varepsilon'}$ around $\tau$ and an associated
diffeomorphism $\phi^i : Y^i_{\varepsilon'} \to X_{\varepsilon'}$,

• the mapping $\phi^i$ extends to a diffeomorphism from $\mathbb{R}^2$ to $\mathbb{R}^2$ which reduces to the
identity outside a compact set.

Figure 5. The tubular neighborhood $Y_{\varepsilon'} = \bigcup_i Y^i_{\varepsilon'}$ of image($\tau$) and the mapping $\phi^i : Y^i_{\varepsilon'} \to X_{\varepsilon'}$.

Thus, the set $Y_{\varepsilon'} = \bigcup_i Y^i_{\varepsilon'}$ is a tubular neighborhood of image($\tau$) on which we have nice
local coordinate systems, see Figure 5. This will allow us to locally bend the curve $\tau$. From
now on, $\phi^i$ always denotes the extended diffeomorphism from $\mathbb{R}^2$ to $\mathbb{R}^2$.

Next, choose a $C^\infty$ function $w_2 : \mathbb{R} \to [0, 1]$ supported in $[-1, 1]$ satisfying



(4.1^ 



and 



W2[ 



W2 



1{-I<t<0) and W2{i) + W2C-^) = 1 {0 < t < ^] 



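Define a smooth partition of unity of $[0, 1]$ using $w_2$ by

\[ w_{2,i}(t/\rho) = w_2((t - t_i)/\rho), \quad 1 \le i \le m, \]

and accordingly the distributions

\[ \mathcal{C}^i = \int_{t_{i-1}}^{t_{i+1}} w_{2,i}(t/\rho) \, \delta_{\tau(t)} \, dt; \]

the partition of unity property giving $\sum_i \mathcal{C}^i = \mathcal{C}$.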
Now consider the action of $\phi^i$ on the distribution $f$:

\[ (\phi^i)^\star f = f \circ \phi^i. \]

This action induces a linear transformation on the space of curvelet coefficients. With $\alpha(f)$
the curvelet coefficients of $f$ and $\beta(f)$ the curvelet coefficients of $(\phi^i)^\star f$, we obtain a linear
operator

\[ M_{\phi^i}(\alpha(f)) = \beta(f). \]

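It is by now well-known that diffeomorphisms preserve sparsity of frame coefficients when
the frame is based on parabolic scaling (as with curvelets and shearlets), e.g., by [35] (see
also [6, Theorem 6.1, page 219]), for any $0 < p \le 1$,

\[ \|M_{\phi^i}\|_{Op,p} := \max \left\{ \sup_{\eta'} \| (\langle \gamma_\eta, (\phi^i)^\star \gamma_{\eta'} \rangle)_\eta \|_p, \ \sup_{\eta} \| (\langle \gamma_\eta, (\phi^i)^\star \gamma_{\eta'} \rangle)_{\eta'} \|_p \right\} < \infty. \]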
After having carefully bent the curve pieces, we can also reverse the process and glue
them together. Choosing, e.g., f3j = ((7^,Cj))^, from the decomposition Cj = YlT=i^] 
have 

m 
i=l 

This decomposition allows us to relate sparsity of coefficients of the linear singularity to 
those of the curvilinear singularity: 

WjWp < "^^^^ ■ (^max II M^. II op,pj ■ II lip. (4.2) 

Finally, we define the very special distribution $w\mathcal{L}$ we will consider, supported on a line
segment $\{0\} \times [-\rho, \rho]$, by

wC = W2{x2/p) ■ So{xi). 

Then we can write 

wC = w -k C, 

where 

u, = u,2(p6) ■ P ■ ^o(6) and £ = 5o(6)- 
Thus the action of wC on a continuous function / is given by 

2n{wC, f) = {C,w^ /) = J{w^ m^, 0)d^,. 

Conceptually, $w\mathcal{L}$ is a straight curve fragment, to which the approach taken in [18] reduced
the analysis of $\mathcal{C}$.

In conclusion, this approach enables us to consider curvelet coefficients of a linear singularity
instead of a curvilinear singularity, with a linear operator mapping one coefficient set onto
the other.

4.2. Estimates for Curvelet Coefficients. We start by estimating the interaction of
curvelet atoms and the interaction of a curvelet atom with a wavelet atom. These results
are proved in [18].

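Lemma 4.2 ([18]). For each $N = 1, 2, \dots$, there is a constant $c_N$ so that

\[ |\langle \gamma_{a,b,\theta}, \gamma_{a_0,b_0,\theta_0} \rangle| \le c_N \cdot 1_{\{|\log_2(a/a_0)| < 3\}} \cdot 1_{\{|\theta - \theta_0| < 10\sqrt{a_0}\}} \cdot \langle |b - b_0|_{a_0,\theta_0} \rangle^{-N}. \]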

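Lemma 4.3 ([18]). For each $N = 1, 2, \dots$, there is a constant $c_N$ so that

\[ |\langle \gamma_{a,b,\theta}, \psi_{a_0,b_0} \rangle| \le c_N \cdot a^{1/4} \cdot 1_{\{|\log_2(a/a_0)| < 3\}} \cdot \langle |b - b_0|_{a,\theta} \rangle^{-N}. \]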

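We now first analyze curvelet coefficients of the point singularity $\mathcal{P}$. The technical proof
will be given in Subsection 8.2.

Lemma 4.4. For each $N = 1, 2, \dots$, there is a constant $c_N$ so that

\[ |\langle \mathcal{P}_j, \gamma_{a_j,b,\theta} \rangle| \le c_N \cdot a_j^{-1/4} \cdot \langle |D_{1/a_j} R_\theta b| \rangle^{-N}, \quad \forall j \in \mathbb{Z}, \ \forall b, \theta. \]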

Next we state two lemmata from [18], which provide estimates for the curvelet coefficients 
of our linear singularity by first considering curvelets, which are almost aligned with the 
singularity, and secondly considering the remaining ones. 




Lemma 4.5 ([18]). Suppose that 9 G [0, a/o], and set 

r := cos9sm9{a~^ — a~^), = bl{al — (Ti'^t), 

and 

2 _ / min{((p - b2)ai + a^^birf, ((-p - b2)ai + a^^irf} : 62 - frf^feir ^ [-p,p], 
^"1 : b2-a^%,Te[-p,p], 

where 

(Ti = sin^ 9 + a"^ cos^ 9^^^ and (T2 = (a"^ sin^ 9 + 0"^ cos^ 9)^^^. 
Then, for N = 1,2,..., 

\{wC,7aM,e)\ < CN ■ a-'/' ■ a^' ■ {d^r' ■ (1(^1,^1^2)!)'-^. 
In particular, if 9 = 0, 

\{wC,^a,b.e)\ < CM ■ a^'/' ■ {\b^/a\r' ■ (a^Ml^iP + ^^{(h - p)\ (^2 + pf}Y'Y~'' . 
and, if 9 = and 62 ^ [~P)P]j 

\{wC,^aAe)\<CN-a-'/'-{\b^/a\y-''. 
Lemma 4.6 ([18]). Suppose that 9 e {^/a,^) . Then, for N = 1,2, .. ., 

\{wC,-fa,b,e)\ < cl,m ■ ■ I cos^l ■ e-"^ ■ ■ (a^/^j g^^^j ^ ^Qg^j^L 



I sin 9 
' 2a 

■(|&2|)~'''- (p + a^/'|cos^| +a|sin^|)^^. 



4.3. Relation of $\mathcal{T}_{2,j}$ to Significant Coefficients from $\ell_1$ Minimization. Comparing
the set of significant coefficients $\mathcal{T}_{2,j}$ we derive from thresholding with the set of significant
coefficients associated with $\ell_1$ minimization studied in [18] will be quite beneficial, since it
will later allow us to exploit some of the results from that paper.

To start, we briefly review the definitions and choices made for the significant curvelet
coefficients associated with $\ell_1$ minimization. Recalling the definition of the straight curve
fragment $w\mathcal{L}$ from Section 4.1, we first define a neighborhood of $WF(w\mathcal{L})$ by

\[ \mathcal{N}^{PS}(a, c, \varepsilon') = \{ b \in \mathbb{R}^2 : d_2(b, \{0\} \times [-2\rho, 2\rho]) \le c \cdot D_2(a, \varepsilon') \} \times [0, \sqrt{a}], \tag{4.3} \]

where $c > 0$ is some constant and

\[ D_2(a, \varepsilon') = a^{1 - \varepsilon'} \quad \text{for some } \varepsilon' > 0. \]

Then the set of significant curvelet coefficients for wC was in [18] chosen as 

4(c, e') = {{j, k,i)e U A,v : (6,,,,,, 9,,,) e A/'^^(a„ c, e')}. 

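Let us now first consider the set $\tilde{\mathcal{T}}_{2,j}$ defined by

\[ \tilde{\mathcal{T}}_{2,j} = \{ \eta : |\langle w\mathcal{L}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - \varepsilon)} \}, \]

which is related to $S_j$ in the following way:

Proposition 4.1. There exist $c_1, c_2 > 0$ and $\varepsilon_1', \varepsilon_2' \in (0, \varepsilon)$ such that

\[ S_j(c_1, \varepsilon_1') \subseteq \tilde{\mathcal{T}}_{2,j} \subseteq S_j(c_2, \varepsilon_2'). \]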
Proof. We first prove $\tilde{\mathcal{T}}_{2,j} \subseteq S_j(c_2, \varepsilon_2')$. Using the estimate in Lemma 4.5 as a model, we
obtain the following: For all $(b_{j,k,\ell}, \theta_{j,\ell})$ with $(j, k, \ell) \in \tilde{\mathcal{T}}_{2,j}$ and $N = 1, 2, \dots$, we have

c^r" ■ (rfi)^ ■ (1(^^1,^2)1) < CN ■ 2^'^. 

Now Lemma 4.6 implies that WLOG we only need to consider the case $\theta \in [0, \sqrt{a}]$ due to
the rapidly decaying exponential factor. To obtain an estimate for $b$, also WLOG we can
assume that $\theta = 0$, which implies



a'M^ . {\bJa\)T^ ■ (a"^[|6ip + min(62 ± p)T^^) < ■ 2^'^ 



which is equivalent to 

(|6i/a|)i^ ■ (a-i[|6i|' + min(62 ±p)T^') < cn ■ 2^'^. (4.4) 
Since both factors are larger than 1, we can split this inequality into 

{\bi/a\)7^ <cn -2^^ (4.5) 

and 

(a-i[|6iP + niin(62±p)T^') <cn-2^^. (4.6) 
From (4.5), we conclude that 

\bi\<c^-2~^^'-'\ (4.7) 

and from (4.6), we conclude that 

inin I62 ±p\<cn- 2~^^^-^\ (4.8) 

Thus, for $c_2$ and $\varepsilon_2' \in (0, \varepsilon)$ appropriately chosen,

\[ \tilde{\mathcal{T}}_{2,j} \subseteq S_j(c_2, \varepsilon_2'). \]

The converse inclusion can be derived by substituting (4.7) and (4.8) into (4.4). This
proves the proposition. □

However, we wish to remind the reader that it is the set of significant curvelet coefficients
of the curvilinear singularity we aim to analyze. For this reason, in the approach presented in
[18], the aforementioned linear operator was exploited to obtain the set of significant curvelet
coefficients of $\mathcal{C}$ based on the chosen set $S_j(c, \varepsilon')$ for $w\mathcal{L}$. For this, let $M_{F_j} = (\langle \gamma_\eta, F_j \star \gamma_{\eta'} \rangle)_{\eta,\eta'}$
be the filtering matrix associated with the filter $F_j$. The 'correct' linear operator to consider
is defined by the matrix

Mj = MiT^ ■M(<^,)-i, 

and the entries of this matrix will be denoted by Mj{ri,ri'). Further, we let denote the 
amplitude of the n'th largest element of the ?7"th column. Now setting 

Spies') = {r/ : r/' G S,{c,e') and |M;(r/,r/')| > V,2.4, 

the overall cluster set of significant curvelet coefficients of C is 




Highly technical and tedious computations (compare [18, Sec. 7]) - which we decided to 
not repeat here due to their non-intuitive nature - then imply the following result by using 
Proposition 4.1. 

Proposition 4.2. There exist $c_1, c_2 > 0$ and $\varepsilon_1', \varepsilon_2' \in (0, \varepsilon)$ such that

C {ri : |(C„7.)| > 2^-(^/^-)} C S,,{c,,e',). 

This observation ensures that results from [18] concerning the set of significant curvelet 
coefficients are transferable to the situation under consideration in this paper; pleasing news 
which we intend to take advantage of. 

4.4. Geometry of $\mathcal{T}_{2,j}^{PS}$. Our next goal is to show that instead of considering the set $\mathcal{T}_{2,j}$,
which depends on the residual $\mathcal{R}_j$ and is therefore typically difficult to handle, we might consider the
'easier-to-handle' set

\[ \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - \varepsilon')} \} \]

with some control on $\varepsilon'$. This requires a careful analysis of the behavior of the coefficients
$\langle \mathcal{R}_j, \gamma_\eta \rangle$, which are of the following form:

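Lemma 4.7. We have

\[ \langle \mathcal{R}_j, \gamma_\eta \rangle = \langle \mathcal{C}_j, \gamma_\eta \rangle - \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{C}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle + \sum_{\lambda \in \mathcal{T}_{1,j}^c} \langle \mathcal{P}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle. \]

Proof. We compute

\[ \langle \mathcal{R}_j, \gamma_\eta \rangle = \langle \mathcal{P}_j, \gamma_\eta \rangle + \langle \mathcal{C}_j, \gamma_\eta \rangle - \Big\langle \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{P}_j, \psi_\lambda \rangle \psi_\lambda, \gamma_\eta \Big\rangle - \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{C}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle. \]

Using the fact that $(\psi_\lambda)_\lambda$ is a tight frame, we conclude that

\[ \langle \mathcal{P}_j, \gamma_\eta \rangle - \Big\langle \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{P}_j, \psi_\lambda \rangle \psi_\lambda, \gamma_\eta \Big\rangle = \sum_{\lambda \in \mathcal{T}_{1,j}^c} \langle \mathcal{P}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle, \]

and the lemma is proved. □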
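The identity of Lemma 4.7 uses only the tight frame property of the wavelets, so it can be sanity-checked numerically in a toy setting. The following Python sketch is an illustration under stated assumptions, not part of the original analysis: a random orthonormal basis of $\mathbb{R}^n$ stands in for the tight frame $(\psi_\lambda)_\lambda$, random vectors stand in for $\mathcal{P}_j$, $\mathcal{C}_j$ and a single curvelet $\gamma_\eta$, and the index set is an arbitrary mask. All names (`Psi`, `P`, `C`, `gamma`, `T1`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32

# Assumption: an orthonormal basis (a tight frame with bound 1) replaces the wavelets.
# The columns of Psi play the role of the psi_lambda.
Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))

P = rng.standard_normal(n)      # stand-in for the pointlike part P_j
C = rng.standard_normal(n)      # stand-in for the curvelike part C_j
gamma = rng.standard_normal(n)  # stand-in for one curvelet gamma_eta
T1 = rng.random(n) < 0.3        # arbitrary index set T_{1,j} as a boolean mask

f = P + C
coeffs = Psi.T @ f              # <f, psi_lambda> for all lambda
R = f - Psi @ (coeffs * T1)     # residual after removing the T1 part

lhs = R @ gamma                 # <R_j, gamma_eta>
pg = Psi.T @ gamma              # <psi_lambda, gamma_eta> for all lambda
rhs = (C @ gamma
       - np.sum((Psi.T @ C)[T1] * pg[T1])     # sum over T1 of <C, psi><psi, gamma>
       + np.sum((Psi.T @ P)[~T1] * pg[~T1]))  # sum over T1^c of <P, psi><psi, gamma>

print(np.isclose(lhs, rhs))
```

The check does not depend on the particular mask or vectors, which mirrors the fact that the lemma holds for any thresholding set.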
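The threshold was chosen precisely so that $\langle \mathcal{R}_j, \gamma_\eta \rangle \approx \langle \mathcal{C}_j, \gamma_\eta \rangle$ for all $\eta$ asymptotically,
i.e., that the two residuals in Lemma 4.7 become asymptotically negligible. A quantitative
statement of this consideration is

Proposition 4.3. For any $\delta > 0$,

\[ \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{C}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle + \sum_{\lambda \in \mathcal{T}_{1,j}^c} \langle \mathcal{P}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle = O(2^{-j(1/4 - \delta)}), \quad j \to \infty. \]

In particular, we have

\[ \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - (\varepsilon - \delta))} \} \subseteq \{ \eta : |\langle \mathcal{R}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - \varepsilon)} \} \subseteq \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - (\varepsilon + \delta))} \}. \]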
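Proof. Let $\delta > 0$ be arbitrary. For proving the first claim, we consider both terms on the
LHS separately. By Lemmata 3.3 and 4.3,

\[ \Big| \sum_{\lambda \in \mathcal{T}_{1,j}} \langle \mathcal{C}_j, \psi_\lambda \rangle \langle \psi_\lambda, \gamma_\eta \rangle \Big| \le c \cdot \sum_{\lambda \in \mathcal{T}_{1,j}} |\langle \psi_\lambda, \gamma_\eta \rangle| \le c \cdot |\mathcal{T}_{1,j}| \cdot 2^{-j/4}. \]

Since Lemma 3.5 implies that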

we obtain 



For N large enough. 



^ (Cj,?/^a)(^a,77?) 



AeTi 



AeTij 

Secondly, by Lemmata 3.2 and 3.4, 

\{V,.i^x)\<CN 

For N large enough, we have 



-TV 



|fc|>cjv2^T7V^ 



J{x:|x|>Civ2^~3W-} 



By (4.10) and (4.11), also exploiting Lemma 4.3, 



Y (P„^a)(^a,7,) 



(4.9) 



(4.10) 



(4.11) 



< c-2^'^-2-^/' < c-2-^(i/'-^), j^oo. (4.12) 



Now the first claim follows from (4.9) and (4.12). 

The 'in particular'-part can now be derived as a consequence of the first claim by using 
Lemma 4.7. □ 

Lemma 3.6 already proved that $WF(\mathcal{P}) \subseteq \mathcal{T}_{1,j}^{PS}$. Our last result in this subsection shows
that a similar result holds true for the wavefront set of $\mathcal{C}$ and the thresholding set $\mathcal{T}_{2,j}^{PS}$.
These two results will be one main ingredient for proving the separation of wavefront sets
through Single-Pass Alternating Thresholding stated in Theorem 1.3.

Lemma 4.8. For j sufficiently large, 
Hence, in particular, 

\[ WF(\mathcal{C}) \cap \{ (b_{j,k,\ell}, \theta_{j,\ell}) : (j, k, \ell) \in \Delta_j \} \subseteq \mathcal{T}_{2,j}^{PS}. \]
Proof. By Parseval,

l(^A-'7j,(o,fc2),o)| = 27r|(u;/:j-,7j-(o,fc2),o)| 

Apply the change of variables ( = {^i/2^ ,^2) and = 2^d^, 



|(«;£„7,,(o,M,o)l = 27rp ■ 2^/' ■ J W2{-pC,2) j W{\C''^\)V{2^"u{0)dC 



2- 



(4.13) 

where C''"'^ = (Ci) C2/2-') and uj{C) denotes the angular component of the polar coordinates of 
$\zeta$. As $j \to \infty$, the integration area is asymptotically of the form

S = ([-2,-1/2] U [1/2,2]) X [-2^l\2^l\ 

Letting L5 = 2^^'^^'^~^\ the choice of W and V implies that the dependence of 



[-Ls.U] 3 C2 ^ 1 W{\C^^^)V{2^'^uj{0)dCi 



on j is asymptotically negligible, and that its absolute value is uniformly bounded from 
below. Thus, by (4.13) and taking the rapid decay condition (4.1) on W2 into account, for 
some c > 0, 

7,,(o,fo),o)l >c-2^-/4. j ^^^_p^^yik./2^/-K.dC,. (4.14) 
Finally, again by (4.1), we can conclude that there exists some c' > such that 

j M-pC2)e'^''^'''"^^'dC2 > c, \^k2 e 2^[-p,p]. (4.15) 

Combining (4.14) and (4.15), for sufficiently large j, 

|(^£„7,,(o,..),o)| > 2^-(^/^-^), yk2 e 2^-[-p,p] (4.16) 

which was claimed. 

For the 'in particular' part, we first observe that due to Proposition 4.3, WLOG we can
consider

\[ \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - (\varepsilon + \delta))} \} \]

for defining $\mathcal{T}_{2,j}^{PS}$. We then employ the careful bending of the curve as detailed in Section
4.1, Proposition 4.2, [18, Lem. 7.8], and Proposition 4.1, as well as the fact that $WF(w\mathcal{L}) =
\{(0, b) : b \in [-\rho, \rho]\} \times \{0\}$. This consideration allows us to conclude that the claim follows
from (4.16). □

5. Asymptotic Separation 

This section is devoted to the analysis surrounding, and the proof of, Theorem 1.1. We first
consider an abstract separation setting, which we will subsequently apply to each filtered
version of an image composed of pointlike and curvelike structures.
5.1. Abstract Separation Estimate for Thresholding. Suppose we have two tight
frames $\Phi_1 = (\phi_{1,i})_i$, $\Phi_2 = (\phi_{2,j})_j$ in a Hilbert space $\mathcal{H}$, and a signal vector $S \in \mathcal{H}$. We
assume that all frame vectors are normalized to $c$, say, i.e.,

\[ \|\phi_{1,i}\|_2 = c \quad \text{and} \quad \|\phi_{2,j}\|_2 = c \quad \text{for all } i, j. \]

We know a priori that there exists a decomposition

\[ S = S_1^0 + S_2^0, \]

where $S_1^0$ is sparse in $\Phi_1$ and $S_2^0$ is sparsely represented in $\Phi_2$.

Abstract Version of One-Step-Thresholding

Parameters:

• Signal $S$.

• Thresholds $t_1$ and $t_2$.

Algorithm:

1) Threshold Coefficients with respect to Frame $\Phi_1$:

a) Compute $c_i = \langle S, \phi_{1,i} \rangle$ for all $i$.

b) Apply threshold and set $\mathcal{T}_1 = \{ i : |c_i| \ge t_1 \}$.

2) Reconstruct and Residualize $\Phi_1$-Components:

a) Compute $S_1^\star = \Phi_1 1_{\mathcal{T}_1} \Phi_1^T S$.

b) Compute $R = S - S_1^\star = \Phi_1 1_{\mathcal{T}_1^c} \Phi_1^T S$.

3) Threshold Coefficients with respect to Frame $\Phi_2$ of Residual:

a) Compute $d_j = \langle R, \phi_{2,j} \rangle$ for all $j$.

b) Apply threshold and set $\mathcal{T}_2 = \{ j : |d_j| \ge t_2 \}$.

4) Reconstruct $\Phi_2$-Components:

a) Compute $S_2^\star = \Phi_2 1_{\mathcal{T}_2} \Phi_2^T R$.

Output:

• Significant thresholding coefficients: $\mathcal{T}_1$ and $\mathcal{T}_2$.

• Approximations to $S_1^0$ and $S_2^0$: $S_1^\star$ and $S_2^\star$.

Figure 6. Abstract version of the One-Step algorithm to decompose $S = S_1^0 + S_2^0$.
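The steps of Figure 6 can be sketched in a few lines of Python. This is a minimal finite-dimensional illustration under assumptions that are ours, not the paper's: the two tight frames are replaced by two orthonormal bases of $\mathbb{R}^{64}$ (spikes and a DCT-II cosine basis) instead of wavelets and curvelets, and the thresholds and test signal are chosen by hand. The names `one_step_thresholding` and `dct_basis` are hypothetical.

```python
import numpy as np

def one_step_thresholding(S, Phi1, Phi2, t1, t2):
    """Single-pass alternating thresholding for two real Parseval frames.

    Phi1, Phi2 are synthesis matrices whose columns are the frame vectors,
    so Phi.T is the analysis operator.
    """
    c = Phi1.T @ S                  # 1a) coefficients with respect to Phi1
    T1 = np.abs(c) >= t1            # 1b) significant index set T1
    S1_star = Phi1 @ (c * T1)       # 2a) reconstruct the Phi1-component
    R = S - S1_star                 # 2b) residual
    d = Phi2.T @ R                  # 3a) coefficients of the residual w.r.t. Phi2
    T2 = np.abs(d) >= t2            # 3b) significant index set T2
    S2_star = Phi2 @ (d * T2)       # 4a) reconstruct the Phi2-component
    return np.flatnonzero(T1), np.flatnonzero(T2), S1_star, S2_star

def dct_basis(n):
    """Orthonormal DCT-II basis of R^n, one basis vector per column."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (x + 0.5) * k / n)
    D[0, :] /= np.sqrt(2.0)
    return D.T

n = 64
Phi1 = np.eye(n)                    # spikes: toy substitute for the first frame
Phi2 = dct_basis(n)                 # cosines: toy substitute for the second frame

S1_true = np.zeros(n)
S1_true[10], S1_true[40] = 5.0, -4.0                 # two "pointlike" spikes
S2_true = 3.0 * Phi2[:, 3] + 2.0 * Phi2[:, 7]        # two "smooth" cosine atoms
S = S1_true + S2_true

T1, T2, S1_star, S2_star = one_step_thresholding(S, Phi1, Phi2, t1=2.0, t2=1.0)
rel_err = ((np.linalg.norm(S1_star - S1_true) + np.linalg.norm(S2_star - S2_true))
           / (np.linalg.norm(S1_true) + np.linalg.norm(S2_true)))
print(T1, T2, rel_err)
```

In this toy example the error comes only from the values of the cosine part at the two spike positions, which is the finite-dimensional analogue of the term $\|1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_1$ appearing in the abstract estimate.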

Now we consider an abstract version of One-Step as explained in Figure 6. The following
result provides us with an estimate for the $\ell_2$-separation error which One-Step causes.
Interestingly, both the relative sparsity measure and the cluster coherence are an essential
part of this estimate, similar to the analysis of $\ell_1$ minimization (cf. [18]).
Proposition 5.1. Suppose that $S$ can be decomposed as $S = S_1^0 + S_2^0$. Let $\mathcal{T}_1$, $\mathcal{T}_2$, $S_1^\star$, and
$S_2^\star$ be computed via the algorithm One-Step (Figure 6), and assume that each component $S_i^0$
is relatively sparse in $\Phi_i$ with respect to $\mathcal{T}_i$, $i = 1, 2$, respectively, i.e.,

\[ \|1_{\mathcal{T}_1^c} \Phi_1^T S_1^0\|_1 + \|1_{\mathcal{T}_2^c} \Phi_2^T S_2^0\|_1 \le \delta. \]

Setting $\mu_c = \mu_c(\mathcal{T}_2, \Phi_2; \Phi_1)$, we have

\[ \|S_1^\star - S_1^0\|_2 + \|S_2^\star - S_2^0\|_2 \le c \cdot \left[ (1 + \mu_c) \cdot \|1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_1 + (2 + \mu_c) \cdot \delta \right]. \tag{5.1} \]

This proposition will be proven in Subsection 8.3.1.

5.2. Application to the Separation of $\mathcal{P}$ and $\mathcal{C}$. We now apply the estimate (5.1) from
Proposition 5.1 to the following situation: $S$ will be the filtered composition of curves and
points $f_j$, with $S_1^0$ being the pointlike part $\mathcal{P}_j$ and $S_2^0$ the curvelike part $\mathcal{C}_j$. Our two tight
frames of interest, $\Phi_1$ and $\Phi_2$, will be chosen to be radial wavelets and curvelets, and we notice
that these are indeed equal-norm as required by Proposition 5.1. Finally, the approximations
to $\mathcal{P}_j$ and $\mathcal{C}_j$ computed by the algorithm One-Step, i.e., $S_1^\star$ and $S_2^\star$, will be denoted by $W_j$
and $C_j$, respectively.

Let $\delta_j$ denote the degree of approximation by thresholded coefficients, i.e., the sum $\delta_j =
\delta_{1,j} + \delta_{2,j}$ of the wavelet approximation error to the point singularity:

\[ \delta_{1,j} = \sum_{\lambda \in \mathcal{T}_{1,j}^c} |\langle \mathcal{P}_j, \psi_\lambda \rangle|, \]

and the curvelet approximation error to the curvilinear singularity:

\[ \delta_{2,j} = \sum_{\eta \in \mathcal{T}_{2,j}^c} |\langle \mathcal{C}_j, \gamma_\eta \rangle|. \]

Further, let $\mu_c(\mathcal{T}_{2,j}, \Phi_2; \Phi_1)$ denote the cluster coherence

\[ (\mu_c)_j = \mu_c(\mathcal{T}_{2,j}, \Phi_2; \Phi_1) = \max_\lambda \sum_{\eta \in \mathcal{T}_{2,j}} |\langle \gamma_\eta, \psi_\lambda \rangle|, \]

the maximal coherence of a wavelet to a cluster of thresholded curvelet coefficients. We then
have

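Corollary 5.1. Suppose that the sequences of significant thresholding coefficients $(\mathcal{T}_{1,j})_j$ and
$(\mathcal{T}_{2,j})_j$ computed via One-Step (Figure 1) have all of the following three properties:
(i) asymptotically negligible cluster coherence:

\[ (\mu_c)_j = \mu_c(\mathcal{T}_{2,j}, \Phi_2; \Phi_1) \to 0, \quad j \to \infty, \]

(ii) asymptotically negligible cluster approximation error:

\[ \delta_j = \delta_{1,j} + \delta_{2,j} = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty, \]

(iii) asymptotically negligible energy of the wavelet coefficients of $\mathcal{C}_j$ on $\mathcal{T}_{1,j}$:

\[ \sum_{\lambda \in \mathcal{T}_{1,j}} |\langle \mathcal{C}_j, \psi_\lambda \rangle| = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty. \]

Then we have asymptotically near-perfect separation:

\[ \frac{\|W_j - \mathcal{P}_j\|_2 + \|C_j - \mathcal{C}_j\|_2}{\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2} \to 0, \quad j \to \infty. \]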
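The two quantities that drive this estimate, cluster coherence and relative sparsity, are easy to compute in a finite-dimensional toy model. The sketch below is an illustration under our own assumptions (spikes and an orthonormal DCT-II basis of $\mathbb{R}^{64}$ in place of wavelets and curvelets, a hand-picked index set); the function names `cluster_coherence` and `relative_sparsity` are hypothetical and not taken from the paper.

```python
import numpy as np

def cluster_coherence(Phi_a, T, Phi_b):
    """mu_c(T, Phi_a; Phi_b) = max_j sum_{i in T} |<phi_{a,i}, phi_{b,j}>|."""
    gram = np.abs(Phi_a[:, T].T @ Phi_b)   # rows: i in T, columns: all j
    return gram.sum(axis=0).max()

def relative_sparsity(Phi, T, S0):
    """l1 norm of the analysis coefficients of S0 outside the index set T."""
    coeffs = Phi.T @ S0
    outside = np.ones(coeffs.shape[0], dtype=bool)
    outside[T] = False
    return np.abs(coeffs[outside]).sum()

n = 64
spikes = np.eye(n)                          # toy substitute for the first frame

# Orthonormal DCT-II basis, one basis vector per column (toy second frame).
k = np.arange(n)[:, None]
x = np.arange(n)[None, :]
D = np.sqrt(2.0 / n) * np.cos(np.pi * (x + 0.5) * k / n)
D[0, :] /= np.sqrt(2.0)
cosines = D.T

T2 = [3, 7]                                 # hand-picked cluster of cosine atoms
mu_c = cluster_coherence(cosines, T2, spikes)

S2_0 = 3.0 * cosines[:, 3] + 2.0 * cosines[:, 7]
delta_2 = relative_sparsity(cosines, T2, S2_0)
print(mu_c, delta_2)
```

Here the cluster coherence is small because every cosine atom is spread out over all 64 samples, and the relative sparsity term vanishes up to rounding because the signal is synthesized exactly from the atoms in the cluster.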

5.3. Proof of Theorem 1.1. We first recall the following result from [18]. 
Lemma 5.1 ([18]).

\[ \|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2 = \Omega(2^{j/2}), \quad j \to \infty. \]

It is now sufficient to show that conditions (i)-(iii) in Corollary 5.1 hold true, which is the 
content of the following four short lemmas. Notice that part (ii) is split into two claims. 

Lemma 5.2.

\[ (\mu_c)_j = \max_\lambda \sum_{\eta \in \mathcal{T}_{2,j}} |\langle \gamma_\eta, \psi_\lambda \rangle| \to 0, \quad j \to \infty. \]

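Proof. By Proposition 4.3, it suffices to prove the result for

\[ \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - (\varepsilon + \delta))} \} \]

instead of $\mathcal{T}_{2,j}$ for $\delta > 0$ arbitrarily small. By Proposition 4.2,

with $\varepsilon' < \varepsilon + \delta < 1/32$. Now the claim follows from [18, Lem. 7.7]. □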
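Lemma 5.3.

\[ \delta_{1,j} = \sum_{\lambda \in \mathcal{T}_{1,j}^c} |\langle \mathcal{P}_j, \psi_\lambda \rangle| = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty. \]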

Proof. By Lemmata 3.2 and 3.4, 
For N large enough and e < 1/32, we have 



~N 



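hence, by Lemma 5.1,

\[ \sum_{\lambda \in \mathcal{T}_{1,j}^c} |\langle \mathcal{P}_j, \psi_\lambda \rangle| = o(2^{j/2}) = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty. \]

□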

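Lemma 5.4.

\[ \delta_{2,j} = \sum_{\eta \in \mathcal{T}_{2,j}^c} |\langle \mathcal{C}_j, \gamma_\eta \rangle| = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty. \]

Proof. The argument is similar to the proof of Lemma 5.2, this time using [18, Lem.
7.5] instead of [18, Lem. 7.7]. □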
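Lemma 5.5.

\[ \sum_{\lambda \in \mathcal{T}_{1,j}} |\langle \mathcal{C}_j, \psi_\lambda \rangle| = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty. \]

Proof. By Lemmata 3.3 and 3.5, and by Lemma 5.1,

\[ \sum_{\lambda \in \mathcal{T}_{1,j}} |\langle \mathcal{C}_j, \psi_\lambda \rangle| \le \sum_{\lambda \in \mathcal{T}_{1,j}} c = c_N \cdot c \cdot 2^{j(1-2\varepsilon)/N} = o(2^{j/2}) = o(\|\mathcal{P}_j\|_2 + \|\mathcal{C}_j\|_2), \quad j \to \infty, \]

for sufficiently large $N$. □
The conditions of Corollary 5.1 are satisfied, hence Theorem 1.1 is proven.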



6. Approximation of the Wavefront Sets 
This section is devoted to proving Theorem 1.2. 

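6.1. Proof of Theorem 1.2 (i). By Lemma 3.5, $(b, \theta) \in \mathcal{T}_{1,j}^{PS}$ implies that, for each $N =
1, 2, \dots$, there is a constant $c_N$ so that

\[ |b/a_j| \le c_N \cdot 2^{j(1-2\varepsilon)/(2N)}, \]

hence

\[ \mathcal{T}_{1,j}^{PS} \subseteq \{ b \in \mathbb{R}^2 : |b| \le c_N \cdot 2^{-j(1 - (1-2\varepsilon)/(2N))} \} \times \mathbb{P}^1. \]

Thus

\[ d_{PS}(\mathcal{T}_{1,j}^{PS}, WF(\mathcal{P})) = d_{PS}(\mathcal{T}_{1,j}^{PS}, \{0\} \times \mathbb{P}^1) \le c_N \cdot 2^{-j(1 - (1-2\varepsilon)/(2N))}, \]

which, for sufficiently large $N$, immediately implies Theorem 1.2 (i). □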

6.2. Proof of Theorem 1.2 (ii). First we observe that, due to Proposition 4.3, WLOG
we can consider

\[ \{ \eta : |\langle \mathcal{C}_j, \gamma_\eta \rangle| \ge 2^{j(1/4 - (\varepsilon + \delta))} \} \]

instead of $\mathcal{T}_{2,j}$ with arbitrarily small $\delta > 0$. From application of Proposition 4.2, [18, Lem.
7.8], and Proposition 4.1, it follows that

C{beR': \b\ < CN ■ 2^-(^'+4(^+^)-i)} x [o^2-^^'/^-^'+'\ 

for some $c$. A similar conclusion as in the proof of Theorem 1.2 (i) then yields

\[ \limsup_{j \to \infty} d_{PS}(\mathcal{T}_{2,j}^{PS}, WF(\mathcal{C})) = 0, \]

which is what was claimed. □



7. Separation of the Wavefront Sets 
This section is devoted to proving Theorem 1.3. 
7.1. A crucial Lemma. For proving Theorem 1.3, we first state a general lemma on curvelet
synthesis and the associated wavefront set, which will later be applied to the functions
$\sum_j F_j \star W_j$ and $\sum_j F_j \star C_j$.

Lemma 7.1. Let $\Omega \subset \mathbb{R}^2 \times [0, \pi)$ be a compact set in phase space, let $(T_j)_{j \ge 0}$ be a nested
sequence of discrete sets such that $T_j \subseteq \Omega$ for all $j \ge j_0$, and let $(d_{a_j,b,\theta})_{j \ge 0, (b,\theta) \in T_j}$ be a
sequence of complex numbers which satisfies

\[ |d_{a_j,b,\theta}| = O(a_j^{-m}), \quad j \to \infty \tag{7.1} \]

for some $m > 0$. We further define

\[ g_j = \sum_{(b,\theta) \in T_j} d_{a_j,b,\theta} \, \gamma_{a_j,b,\theta} \]

and assume that $(g_j)_{j \ge 0}$ is a bounded sequence in the Schwartz space. Then

\[ WF\Big( \sum_j g_j \Big) \subseteq \Omega. \]

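Proof. Let $(b', \theta') \in \Omega^c$ and consider

\[ \langle g_j, \gamma_{a_{j'},b',\theta'} \rangle = \sum_{(b,\theta) \in T_j} d_{a_j,b,\theta} \langle \gamma_{a_j,b,\theta}, \gamma_{a_{j'},b',\theta'} \rangle. \]

Hence, by Lemma 4.2 and (7.1), for all $N = 1, 2, \dots$,

\[ |\langle g_j, \gamma_{a_{j'},b',\theta'} \rangle| \le c_N \cdot a_j^{-m} \cdot 1_{\{|\log_2(a_j/a_{j'})| < 3\}} \sum_{(b,\theta) \in T_j} 1_{\{|\theta - \theta'| < 10\sqrt{a_{j'}}\}} \cdot \langle |b - b'|_{a_{j'},\theta'} \rangle^{-N}. \]

Thus

\[ \Big| \Big\langle \sum_j g_j, \gamma_{a_{j'},b',\theta'} \Big\rangle \Big| \le c_N \sum_{j = j'-1}^{j'+1} a_j^{-m} \cdot \sum_{(b,\theta) \in T_j} 1_{\{|\theta - \theta'| < 10\sqrt{a_{j'}}\}} \cdot \langle |b - b'|_{a_{j'},\theta'} \rangle^{-N}. \]

Since $(b', \theta') \in \Omega^c$ and $T_j \subseteq \Omega$ for all $j \ge j_0$, for any $N = 1, 2, \dots$ and all $(b, \theta) \in T_j$ we have

\[ \langle |b - b'|_{a_{j'},\theta'} \rangle^{-N} = O(a_{j'}^N), \quad j' \to \infty. \]

Since $m$ is fixed, we conclude that

\[ \Big| \Big\langle \sum_j g_j, \gamma_{a_{j'},b',\theta'} \Big\rangle \Big| = O(a_{j'}^N), \quad j' \to \infty \]

for any $N = 1, 2, \dots$. Then [5, Thm. 5.2] implies that $(b', \theta') \notin WF(\sum_j g_j)$. □
The proof of Theorem 1.3 will now be built upon this lemma.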
7.2. Proof of Theorem 1.3. We start by applying Lemma 7.1 to the situation $g_j = C_j$,
$T_j = \mathcal{T}_{2,j}^{PS}$, and $d_{a_j,b,\theta} = \langle f_j, \gamma_{a_j,b,\theta} \rangle$. Observe that (7.1) is satisfied by the decay estimates for
the curvelet coefficients for $w\mathcal{L}$, Lemma 4.5, and for $\mathcal{P}$, Lemma 4.4, and by the bound for
the curvelet coefficients of $\mathcal{C}$, (4.2). $\Omega$ can be chosen as $\mathcal{N}^{PS}(a, c, \varepsilon')$ with carefully selected
$c$ and $\varepsilon'$ due to the considerations in Section 4.3. Then Lemma 7.1 together with Theorem
1.2 imply

\[ WF\Big( \sum_j F_j \star C_j \Big) \subseteq WF(\mathcal{C}). \tag{7.2} \]

In a similar way, by an obvious adaptation of Lemma 7.1, we can show

\[ WF\Big( \sum_j F_j \star W_j \Big) \subseteq WF(\mathcal{P}). \tag{7.3} \]

Inclusions (7.2) and (7.3) are a significant part of what was claimed; however, a stronger
result is true. In order to prove equality for (7.3), it suffices to show that, since $WF(\mathcal{P}) =
\{0\} \times \mathbb{P}^1$, the term

j'+i 

is of slow decay as $j' \to \infty$, i.e., there exists an $N = 1, 2, \dots$ such that this term behaves
like $\Omega(a^N)$ as $a \to 0$. Similarly, for proving equality for (7.2), it suffices to show that, for all
$(b_{j',k',\ell'}, \theta_{j',\ell'}) \in WF(\mathcal{C})$, the term

i'+i 

j i=i'-i'?e75j 

is of slow decay as $j' \to \infty$. By [5] and the comparable result for wavelets, this then implies
that

\[ WF(\mathcal{C}) \subseteq WF\Big( \sum_j F_j \star C_j \Big) \quad \text{and} \quad WF(\mathcal{P}) \subseteq WF\Big( \sum_j F_j \star W_j \Big), \]

and, combined with (7.2) and (7.3), the theorem is proved.

We now first show slow decay of the term (7.4). For this, we partition the term under 
consideration into the following three terms: 



i'+i 

Y Yl * ^i'.o) = + - 7^13, 



(7.6) 
where

i'+i 

Til = E 

Ti2 = {Vj,Fj'ki)j,fi), 
j'+i 

We start by estimating $T_{11}$. WLOG we can assume that $j = j'$, hence

Til = Y (Ci,^A)(^A,i"j*^j,o). 

By Lemmata 3.1, 3.3, and 3.5, 

|rn|<c^- Y {\k\)'''<CN- Y ^ 

|fc|<Cjv2i(l-2£:)/(2iV) k=0 

Since 

x^^^dx < const 



l-N 



^1 



for sufficiently large $N$, it follows that

\Tii\<c. (7.7) 
Next we estimate $T_{12}$. By Lemma 3.6, for sufficiently large $j'$,

\[ T_{12} \ge c \cdot 2^{j'/2}. \tag{7.8} \]

For $T_{13}$, we first observe that WLOG we can assume that $j = j'$, hence

AeA.ATi,, 

By Lemmata 3.1, 3.2, and 3.4, 

oo 

|Ti3|<c^-2^V2. J2 (|fc|)-^^<c^-2^-/2. J2 k'-'"". 

Since 

/ x'-^'^dx < CM ■ 2-i(i-2-)(^-i)/^, 

iLcjv2-'(i-2^)/(2'V)j 

it follows - by choosing N = 2 - that 

Applying (7.7)-(7.9) to (7.6) implies that the term Tj Pj,'ipj'fi) in (7.4) behaves like 
f2(2-^*^^/^~^''), hence is of slow decay, which was claimed. 
Finally, we prove slow decay of the term (7.5). By Propositions 4.3 and 4.2, [18, Lem.
7.8], and Proposition 4.1, and observing that $WF(w\mathcal{L}) = \{(0, b) : b \in [-\rho, \rho]\} \times \{0\}$, WLOG
we might analyze

j'+i 

J2 ^^'f * 7i',(o,fc^),o), 

i=i'-i ryer2j 

where k'2 G 2-' [— p, p] . For this, we partition the term under consideration into the following 
two terms: 

j'+i 

5Z Yl (^A-'7r?)(7r?,^j *7i',(o,fc^,),o) = T21 - T22, (7.10) 
i=i'-i»7er2j 

where 

T21 = {wCj,Fj-k-fj>^(^o^k'2)fi), 
i'+i 

^22 = 5^ XI (^A-'7»7)(7r;,^i*7j',(o,fcg,o)- 
i=i'-i»)6Aj\r2j 

The term T21 can be directly estimated by Lemma 4.8 - the additional convolution with 
Fj does not affect the asymptotic behavior - as 

|(^/;£„7,,(o,fc^),o)| > 2^'(^/^-^) \/k'2 G 2^-[-p,p]. (7.11) 

Next, we analyze T22 and first notice that WLOG we can assume that j = j'. Thus we 
are left to estimate 

T22 = Y ^^^J' ^J''^'^^ i'yj^kA Fj -k 7j- (o,fe;,),o) 
(j,fc/)eA,\r2,, 

for ^ 2-'[— p, p]. By Lemmata 4.5 and 4.2, and Proposition 4.1 as well as by the definition 
ofX^^{a,c,e') in (4.3), 

IT22I < c^-2^-/4. J2 (|A;-(0,A:;)|)-^ 

{fc:|l(fei/2^fc2/2J/2)-({0}x[-2p,2p])|j2>c-2J(--i)} 
{A::||(fei/2i,fc2/2J/2)-({0}x[-2p,2p])||2>c-2J(-'-i)} 

Since, for sufficiently large $N$,

/ \ixi,X2)\^-''dx2dxi<c-2-'''\ 

J{(a;i,X2):||(xi/2i,a;2/2i/2)-{{0}x[-2p,2p])||2>c-2J(^-l)} 

it follows that 

IT22I < c-2^'(i/^"2=). (7.12) 

Applying (7.11) and (7.12) to (7.10) implies that the term ^j^lj',/:'/') in (7.5) 

behaves like il(2^^^^^~^^), hence is of slow decay, which was claimed. □ 
8. Proofs
8.1. Proof of Results from Section 2. 

8.1.1. Proof of Lemma 3.1. Using Parseval, $\langle \psi_{a,b}, \psi_{a_0,b_0} \rangle = 2\pi \int \hat{\psi}_{a,b}(\xi) \overline{\hat{\psi}_{a_0,b_0}(\xi)} \, d\xi$, we
consider



Due to the scaling property, this term is non-zero if and only if $|\log_2(a/a_0)| < 3$. Hence
from now on WLOG we can assume that $a = a_0$. Also, WLOG we may assume that $b_0 = 0$.
Applying the change of variables $\zeta = a\xi$,



= J I iy(r) 1 2e-*(''/'^)« de- 



Applying integration by parts, for any k = 1,2, 
\{ipa,b,ipao,bo)\ = 27r-|6/a|~'= 



< 2tt ■\b/a\-'' J \A''[\W{r)\^]\d^. 



Hence 

{l + \b/an-\{^^,,,^Pa,,,„)\< I \W{r)\'d^ + I \A'[\W{r)\']\dt 
Since the integrand is independent of $a$, and further, for each $k = 1, 2, \dots$,

\[ \langle |b/a| \rangle^k = (1 + |b/a|^2)^{k/2} \le 2^{k/2} (1 + |b/a|^k), \]

the claim follows.
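The last step of this proof rests on an elementary inequality of the form $\langle x \rangle^k \le C_k (1 + |x|^k)$ with $\langle x \rangle = (1 + x^2)^{1/2}$. The short Python check below is a numerical illustration only; the constant $C_k = 2^{k/2}$ is one admissible choice made here as an assumption and is not necessarily the constant used in the original argument. The helper name `bracket` is hypothetical.

```python
import numpy as np

def bracket(x):
    """Bracket <x> = (1 + x^2)^(1/2) used in the decay estimates."""
    return np.sqrt(1.0 + x ** 2)

x = np.linspace(0.0, 50.0, 2001)

# Check <x>^k <= 2^(k/2) * (1 + x^k) on a grid for k = 1, ..., 8.
# The factor (1 + 1e-12) only guards against floating point rounding.
ok = all(
    np.all(bracket(x) ** k <= 2.0 ** (k / 2) * (1.0 + x ** k) * (1.0 + 1e-12))
    for k in range(1, 9)
)
print(ok)
```

The inequality follows from $(1 + x^2)^{k/2} \le (2 \max\{1, x^2\})^{k/2} \le 2^{k/2}(1 + |x|^k)$, so the grid check is a sanity test rather than a proof.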

8.2. Proofs of Results from Section 4. 

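8.2.1. Proof of Lemma 4.4. Using Parseval, $\langle \gamma_{a_j,b,\theta}, \mathcal{P}_j \rangle = 2\pi \int \hat{\gamma}_{a_j,b,\theta}(\xi) \overline{\hat{\mathcal{P}}_j(\xi)} \, d\xi$, we consider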
la,bM)'PAOdi = j a'"^W{ar)V{{uj-e)/^)e-'''^-W{ar)-r-^'Hi. 

Now WLOG we may consider the special case $\theta = 0$, so that $R_\theta = I$. We may also assume
6o = 0. Apply the change of variables C, = Da^ and d( = a^/'^d^. 



= ! W^^(||Ca||)V(a;(Ca)/v^)||(a-^/^Ci,C2)||-^/'e-(^v.^)'^ciC, 
where $\zeta_a = (\zeta_1, \sqrt{a} \, \zeta_2)$ and $\omega(\zeta_a)$ denotes the angular component of the polar coordinates of
$\zeta_a$. Applying integration by parts, for any $k = 1, 2, \dots$,

\{la,b,e,Vj)\ 



271 ■ a~'/^ ■ \Dyab\-' 



All^'(||Ca||)V(u;(Ca)/v^)||(a-^/\i,C2)ir^/1e-^(^^/'''')'frfC 



< 2n . a-'^' ■ \D,/ab\-' 1 1 A^[W^2(||Ca||)l^(c^(Ca)/v^)||(a-ni, C2)ir'/1| ^C- 



Hence 



{l+\Dyabn-\{la,b,e,V,)\ 

< 2n.a~'/'J 0W^'(IICa||)||mCa)/v^)|||(a-ni,C2)||-^/^ 

+ |Ani^^(||Ca||)n^(Ca)/v^)]||(a-ni,C2)ir^/^|] dC (8.1) 

Next we show that, for each k, there exists < oo such that 

+ \A''[W'i\\Ca\\)ViuiCa)/V^)]\]dC<Ck, Va>0. (8.2) 



We have 



and 



Hence, by induction, the absolute values of the derivatives of W^^(||Ca||) are upper bounded 
independently of a. Also, 

and 

^Viu;iCa)/V^) = T^^(C^((C1, ■)a)/v^)(Cl) ■^?2(C,a), 

and tedious computations show that both \gi\, \g2\ possess an upper bound independently 
of a. Thus, by induction, the absolute values of the derivatives of V{u{(a)/y/0') are up- 
per bounded independently of a. Also, obviously, both ^||(a~^''^Ci! C2)||~^'^^ as well as 
a ^ 

0(2 



— Il(a ^/^Ci5C2)|| possess an upper bound independently of a. These observations imply 



.2). 

Further, for each k = 1,2 



, . . . , 



= (1 + \Dy,b\')l < |(1 + \D^/ab\'). (8.3) 

To finish, simply combine (8.1), (8.2), and (8.3), and recall that we chose coordinates so
that $\theta = 0$. Translating back to the case of general $\theta$ gives the full conclusion. □
8.3. Proofs of Results from Section 5.

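8.3.1. Proof of Proposition 5.1. Proof. Since $\Phi_1$ is a tight frame,

\[
\begin{aligned}
\|S_1^\star - S_1^0\|_2
&= \|\Phi_1 1_{\mathcal{T}_1} \Phi_1^T S_2^0 - \Phi_1 1_{\mathcal{T}_1^c} \Phi_1^T S_1^0\|_2 \\
&\le \|\Phi_1 1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_2 + \|\Phi_1 1_{\mathcal{T}_1^c} \Phi_1^T S_1^0\|_2.
\end{aligned}
\]

Apply relative sparsity of the subsignal $S_1^0$ and the equal-norm condition on the tight frame
$\Phi_1$,

\[ \|S_1^\star - S_1^0\|_2 \le c \cdot \left( \|1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_1 + \delta \right). \tag{8.4} \]

Next we estimate $\|S_2^\star - S_2^0\|_2$. We start by using the fact that $\Phi_2$ is a tight frame and also
employ the definition of the residual $R$,

\[
\begin{aligned}
\|S_2^\star - S_2^0\|_2 &= \|\Phi_2 1_{\mathcal{T}_2} \Phi_2^T R - S_2^0\|_2 \\
&\le \|\Phi_2 1_{\mathcal{T}_2} \Phi_2^T \Phi_1 1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_2 + \|\Phi_2 1_{\mathcal{T}_2} \Phi_2^T \Phi_1 1_{\mathcal{T}_1^c} \Phi_1^T S_1^0\|_2 + \|\Phi_2 1_{\mathcal{T}_2^c} \Phi_2^T S_2^0\|_2.
\end{aligned}
\]

Since $\Phi_1$ is a tight frame and the norms of all elements in the tight frame $\Phi_2$ coincide, we
can conclude that

\[
\begin{aligned}
\|S_2^\star - S_2^0\|_2
&\le \Big\| \sum_{j \in \mathcal{T}_2} \sum_{i \in \mathcal{T}_1} \langle \phi_{1,i}, S_2^0 \rangle \langle \phi_{1,i}, \phi_{2,j} \rangle \phi_{2,j} \Big\|_2
+ \Big\| \sum_{j \in \mathcal{T}_2} \sum_{i \in \mathcal{T}_1^c} \langle \phi_{1,i}, S_1^0 \rangle \langle \phi_{1,i}, \phi_{2,j} \rangle \phi_{2,j} \Big\|_2
+ \Big\| \sum_{j \in \mathcal{T}_2^c} \langle \phi_{2,j}, S_2^0 \rangle \phi_{2,j} \Big\|_2 \\
&\le c \cdot \Big[ \sum_{i \in \mathcal{T}_1} \Big( |\langle \phi_{1,i}, S_2^0 \rangle| \sum_{j \in \mathcal{T}_2} |\langle \phi_{1,i}, \phi_{2,j} \rangle| \Big)
+ \sum_{i \in \mathcal{T}_1^c} \Big( |\langle \phi_{1,i}, S_1^0 \rangle| \sum_{j \in \mathcal{T}_2} |\langle \phi_{1,i}, \phi_{2,j} \rangle| \Big)
+ \sum_{j \in \mathcal{T}_2^c} |\langle \phi_{2,j}, S_2^0 \rangle| \Big].
\end{aligned}
\]

Now we have reached the point where cluster coherence and relative sparsity come into
play. These notions allow us to derive

\[
\begin{aligned}
\|S_2^\star - S_2^0\|_2
&\le c \cdot \Big[ \sum_{i \in \mathcal{T}_1} \big( |\langle \phi_{1,i}, S_2^0 \rangle| \cdot \mu_c \big) + \sum_{i \in \mathcal{T}_1^c} \big( |\langle \phi_{1,i}, S_1^0 \rangle| \cdot \mu_c \big) + \delta \Big] \\
&\le c \cdot \left[ \mu_c \cdot \left( \|1_{\mathcal{T}_1} \Phi_1^T S_2^0\|_1 + \delta \right) + \delta \right].
\end{aligned} \tag{8.5}
\]

Combining (8.4) and (8.5) proves the proposition. □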